I'm assuming there are other people (I'm a person too, honest!) up in here asking this same question, but I haven't seen them so far, and I do see all these posts about AI "alignment" and I can't help but wonder: when did we discover an objective definition of "good"?
I've already mentioned it elsewhere here, but I think Nietzsche has some good (heh) thoughts about the nature of Good and Evil, namely that they are subjective concepts. Here's what ChatGPT has to say:
Nietzsche believed that good and evil are not fixed things, but rather something that people create in their minds. He thought that people create their own sense of what is good and what is bad, and that it changes depending on the culture and time period. He also believed that people often use the idea of "good and evil" to justify their own actions and to control others. So, in simple terms, Nietzsche believed that good and evil are not real things that exist on their own, but are instead created by people's thoughts and actions.
How does "alignment" differ? Is there a definition somewhere? From what I can see, it's subjective. What is the real difference between "how to do X" and "how to prevent X"? One form is good and the other is not, depending on what X is? But again, perhaps I misunderstand the goal, and what exactly is being proposed to be controlled.
Is information itself good or bad? Or is it how the information is used that is good or bad (and as mentioned, relatively so)?
I do not know. I do know that I'm stoked about AI, as I have been since I was smol, and as I am about all the advancements we just-above-animals make. Biased for sure.
i'm pretty sure solving either will solve both, and that understanding this is key to solving either. these are all the same thing afaict:
it's all unavoidably the same stack of problems:

- how do you determine whether a chunk of other-matter is in a shape that is safe and assistive for the self-matter's shape, according to consensus of self-matter?
- how can two agentic chunks of matter establish mutual honesty without getting used against their own preferences by the other?
- how do you ensure mutual honesty or interaction doesn't get generated when it's not clear there's safety to be honest or interact?
- how do you ensure it does happen when it's needed?

this sounds like an economics problem to me. seems to me like we need multi-type, multiscale economic feedback to track damage vs fuel vs repair-aid.
eg, on the individual/small group scale: https://www.microsolidarity.cc/
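the mutual-honesty questions above can be sketched (very loosely) as an iterated trust game. everything in this sketch — the agent names, the honesty probabilities, the trust threshold, the asymmetric update rule — is an illustrative assumption of mine, not anything specified in the thread:

```python
import random

# toy sketch: two agents only interact when each judges it safe
# (trust above a threshold), and each updates its trust of the other
# from feedback after every interaction. damage lowers trust faster
# than honest interaction repairs it, a crude stand-in for the
# "multi-type feedback" idea (damage vs repair-aid).

class Agent:
    def __init__(self, name, honesty):
        self.name = name
        self.honesty = honesty   # probability this agent acts honestly
        self.trust = {}          # trust score per other agent, in [0, 1]

    def willing(self, other, threshold=0.3):
        # refuse interaction when it doesn't seem safe to interact
        return self.trust.get(other.name, 0.5) >= threshold

    def act(self):
        return random.random() < self.honesty  # True = honest act

    def update(self, other, was_honest):
        t = self.trust.get(other.name, 0.5)
        # repair is slow (+0.05), damage is fast (-0.15)
        t = min(1.0, t + 0.05) if was_honest else max(0.0, t - 0.15)
        self.trust[other.name] = t

def step(a, b):
    # interaction only happens when BOTH sides judge it safe
    if not (a.willing(b) and b.willing(a)):
        return
    a_honest, b_honest = a.act(), b.act()
    a.update(b, b_honest)
    b.update(a, a_honest)

random.seed(0)
alice = Agent("alice", honesty=0.9)
mallory = Agent("mallory", honesty=0.2)
for _ in range(100):
    step(alice, mallory)
print(alice.trust, mallory.trust)
```

the point of the sketch is the equilibrium it falls into: alice's trust in mallory decays below the safety threshold, so interaction stops entirely — which is exactly the "how do you ensure it does happen when it's needed" half of the problem that a simple threshold doesn't solve.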