Comments

I’m confused. I know that it is like something to be me (this is in some sense the only thing I know for sure). It seems like there are rules which shape the things I experience, and some of those rules can be studied (like the laws of physics). We understand some of these rules well enough to predict certain systems with a high degree of accuracy, like how an asteroid will orbit a star or how electrons will be pushed through a wire by a particular voltage in a circuit. But I have no way to know or predict whether it is like something to be a fish or GPT-4. I know that physical alterations to my brain seem to affect my experience, so it seems like there is a mapping from physical matter to experiences. I do not know precisely what this mapping is, and this indeed seems like a hard problem. In what sense do you disagree with my framing here?

Oh good catch, I missed that. Thanks!

The only metric natural selection is “optimizing” for is inclusive genetic fitness. It did not “try” to align humans with social status, and in many cases people care about social status to the detriment of their inclusive genetic fitness. This is a failure of alignment, not a success.

Answer by Eccentricity, Feb 27, 2024

I am not so sure it will be possible to extract useful work towards solving alignment out of systems we do not already know how to carefully steer. I think that substantial progress on alignment is necessary before we know how to build things that actually want to help us advance the science. Even if we built something tomorrow that was in principle smart enough to do good alignment research, I am concerned we don’t know how to make it actually do that rather than, say, imitate more plausible-sounding but incorrect ideas. The fact that appending silly phrases like “I’ll tip $200” improves the probability of receiving correct code from current LLMs indicates to me that we haven’t succeeded at aligning them to maximally want to produce correct code when they are capable of doing so.

How does Harry know the name “Lucius Malfoy”?

We aren’t surprised by HHTHHTTTHT or whatever because we perceive it as the event “a sequence containing a similar number of heads and tails in any order, ideally without a long subsequence of H or T”, which occurs frequently.
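To put rough numbers on this, here is a minimal sketch that enumerates all 10-flip sequences and compares the probability of one exact sequence with the probability of the broader "looks typical" event. The thresholds (`max_imbalance=2`, `max_run=3`) and the helper names are my own assumptions for illustration; they are one arbitrary way to formalize "a similar number of heads and tails, without a long run", not a canonical definition.

```python
from itertools import product


def longest_run(seq):
    """Length of the longest run of identical consecutive flips."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def looks_typical(seq, max_imbalance=2, max_run=3):
    """Hypothetical criterion: roughly balanced counts and no long run."""
    heads = seq.count("H")
    tails = len(seq) - heads
    return abs(heads - tails) <= max_imbalance and longest_run(seq) <= max_run


n = 10
sequences = ["".join(flips) for flips in product("HT", repeat=n)]

# Any one exact sequence of a fair coin has the same probability.
p_exact = 1 / len(sequences)

# The "looks typical" event covers many sequences at once.
p_typical = sum(looks_typical(s) for s in sequences) / len(sequences)

print(f"P(exactly HHTHHTTTHT) = {p_exact:.6f}")
print(f"P(looks typical)      = {p_typical:.6f}")
```

Under these assumed thresholds, HHTHHTTTHT falls inside the "looks typical" event while HHHHHHHHHH does not, even though each exact sequence is equally likely.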

I’m enjoying this series, and look forward to the next installment.

The thing I mean by “superintelligence” is very different from a government. A government cannot design nanotechnology, and is made of humans, who value human things.

What can men do against such reckless indifference?

Can someone with more knowledge give me a sense of how new this idea is, and guess at the probability that it is onto something?
