I'm a PhD student at the University of Amsterdam. I have research experience in multivariate information theory and equivariant deep learning and recently got very interested in AI alignment. https://langleon.github.io/
It's great to see Yoshua Bengio and other eminent AI scientists like Geoffrey Hinton actively engage in the discussion around AI alignment. Bengio evidently put a lot of thought into this. There is a lot I agree with here.
Below, I'll discuss two points where I disagree with or am surprised by his takes, to highlight potential topics of discussion, e.g. if someone wants to engage directly with Bengio.
After filling out the form, I could click on "see previous responses", which allowed me to see the responses of all other people who have filled out the form so far.
That is probably not intended?
I disagree with this. I think the most useful definition of alignment is intent alignment. Humans are effectively intent-aligned on the goal of not killing all of humanity. They may still kill all of humanity, but that would be a capabilities problem rather than an alignment problem: humans aren't capable of knowing which AI designs will be safe.
The same holds for intent-aligned AI systems that create unaligned successors.
For what it's worth, this comment seems clearly right to me, even if one thinks the post actually shows misalignment. I'm confused about the downvotes on it (5 net downvotes and 12 net disagree votes as of writing this).
Now to answer our big question from the previous section: I can find some satisfying the conditions exactly when all of the ’s are independent given the “perfectly redundant” information. In that case, I just set to be exactly the quantities conserved under the resampling process, i.e. the perfectly redundant information itself.
In the original post on redundant information, I didn't find a definition for the "quantities conserved under the resampling process". You name this F(X) in that post.
Just to be sure: is your claim that if F(X) exists that contains exactly the conserved quantities and nothing else, then you can define like this? Or is the claim even stronger and you think such can always be constructed?
Edit: Flagging that I now think this comment is confused. One can simply define as the conditional, which is a composition of the random variable and the function
When I converse with junior folks about what qualities they’re missing, they often focus on things like “not being smart enough” or “not being a genius” or “not having a PhD.” It’s interesting to notice differences between what junior folks think they’re missing & what mentors think they’re missing.
There may also be social reasons to give different answers depending on whether you are a mentor or mentee. I.e., answering "the better mentees were those who were smarter" seems like an uncomfortable thing to say, even if it's true.
(I do not want to say that this social explanation is the only reason that answers between mentors and mentees differed. But I do think that one should take it into account in one's models.)
https://twitter.com/ai_risks/status/1664323278796898306?s=46&t=umU0Z29c0UEkNxkJx-0kaQ
Apparently Bill Gates signed.
Stating the obvious: Do we expect that Bill Gates will donate money to prevent extinction from AI?