That’s true, but when we’re looking at natural languages that’s not a super useful perspective to take, because natural languages care a lot about how information is packaged. From a linguistic point of view, predication is a way of packaging information with a topic-comment structure. The topic can be thought of as a “discourse file” and the comment can be thought of as information that is being added to that file. Reference can be thought of as opening a new file or accessing a pre-existing one.
I slightly misspoke earlier when I paired adjectives with “attri...
When we're talking about grammatical categories like "noun" and "verb", it is VERY important that we make a sharp distinction between language-internal categories and categories that are useful for making comparisons across languages. If you are trying to say that some languages don't have verbs, you better have a very clear criterion for what counts as a "verb" that can be applied to any language, otherwise you're just going to get yourself hopelessly confused.
The best way to define "noun", "adjective" and "verb" in a cross-linguistically valid way is ac...
The claim isn’t that minds are safe and nice by default. It’s that they’re not sociopaths.
If, in your view, most humans are basically ruthless sociopaths, then that’s good news, isn’t it? Sociopathic AIs would fit well into our culture. It would mean our laws and norms do a remarkably good job of restraining us, so there’d be hope they’d do the same for future AIs.
According to Google, the Xhosa currently number between 9.2 and 9.6 million people, and are the second-largest ethnic group in South Africa. Looks like they've done fine for themselves.
It's weird you would use them as an intuition pump for human extinction being plausible, as if they were some extinct tribe.
Maybe you'll think I'm missing the main point of the essay, which is about what happens when memes or viruses jump species, but I don't think that works either. This argument proves too much. You could just as well say that since most ideas in the wor...
I understand that when a person feels a lot is on the line, it is often hard for them not to come across as sanctimonious. Maybe it’s unfair of me, but that is how this comes across to me. E.g. “people who allegedly care”.
Death with Dignity:
...
>Q2: I have a clever scheme for saving the world! I should act as if I believe it will work and save everyone, right, even if there's arguments that it's almost certainly misguided and doomed? Because if those arguments are correct and my scheme can't work, we're all dead anyways, right?

>A:
>Do you like cinnamon when it's not combined with sugar? If not, is it really cinnamon per se that you like?
How do you feel about butter?
The order of our numeral notation mirrors the order of our spoken numerals. I’m not sure if there are any languages that consistently order additive numerals from smallest to largest - “two and fifty and three hundred” instead of “three hundred and fifty-two”.
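To make the contrast concrete, here’s a minimal Python sketch (the function names are mine, and teens are omitted for brevity) that decomposes a number into its additive parts and renders them in either order:

```python
ONES = ["", "one", "two", "three", "four", "five",
        "six", "seven", "eight", "nine"]
TENS = ["", "ten", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]

def additive_parts(n: int) -> list[str]:
    """Decompose 0 < n < 1000 into spoken additive parts, largest first.

    Teens (11-19) are left out to keep the sketch short.
    """
    hundreds, rest = divmod(n, 100)
    tens, ones = divmod(rest, 10)
    parts = []
    if hundreds:
        parts.append(f"{ONES[hundreds]} hundred")
    if tens:
        parts.append(TENS[tens])
    if ones:
        parts.append(ONES[ones])
    return parts

def spoken(n: int, smallest_first: bool = False) -> str:
    """Join the parts; real English fuses "fifty-two", but the point
    here is only the ordering of the addends."""
    parts = additive_parts(n)
    if smallest_first:
        parts = list(reversed(parts))
    return " and ".join(parts)

print(spoken(352))                       # three hundred and fifty and two
print(spoken(352, smallest_first=True))  # two and fifty and three hundred
```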
>💡Flipping the local ordering of pronunciation: If we're truly optimizing, we might as well say "twenty and hundred-three" while we're at it. The first words "and three-" don't tell you much until you know "three of what"? Whereas "and hundred-three" tells you the order of magnitude as soon ...
I don’t think perfect surveillance is inevitable.
I would prefer it, though. I don’t know any other way to prevent people from doing horrible things to minds running on their computers. It wouldn’t need to be publicly broadcast though, just overseen by law enforcement. I think this is much more likely than a scenario where everything you see is shared with everyone else.
Unfortunately, my mainline prediction is that people will actually be given very strong privacy rights, and will be allowed to inflict as much torture on digital minds under their control as they want. I’m not too confident in this though.
Basically, people tend to value stuff they perceive in the biophysical environment and stuff they learn about through the social environment.
So that reduces the complexity of the problem - it’s not a matter of designing a learning algorithm that both derives and comes to value human abstractions from observations of gas particles or whatever. That’s not what humans do either.
Okay then, why aren’t we star-maximizers or number-of-nation-states maximizers? Obviously it’s not just a matter of learning about the concept. The details of how we get values hooked up to an AGI’s motivations will depend on the particular AGI design, but will probably involve reward, prompting, scaffolding, or the like.
I don’t think the way you split things up into Alpha and Beta quite carves things at the joints. If you take an individual human as Beta, then stuff like “eudaimonia” is in Alpha - it’s a concept in the cultural environment that we get exposed to and sometimes come to value. The vast majority of an individual human’s values are not new abstractions that we develop over the course of our training process (for most people at least).
There is a difference between the claim that powerful agents are approximately well-described as being expected utility maximizers (which may or may not be true) and the claim that AGI systems will have an explicit utility function the moment they’re turned on, and maximize that function from that moment on.
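To make the distinction concrete, here is one standard way to state the first claim (notation illustrative, not from the original): there exists some utility function $U$ such that the agent’s observed choices approximately satisfy

$$a^* \approx \arg\max_{a}\, \mathbb{E}\left[U(\text{outcome}) \mid a\right]$$

Nothing in this requires $U$ to be explicitly represented inside the system, let alone fixed from the moment the system is turned on.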
I think this is the assumption OP is pointing out: “most of the book's discussion of AI risk frames the AI as having a certain set of goals from the moment it's turned on, and ruthlessly pursuing those to the best of its ability”. “From the moment it’s turned on” is pretty important, because it rules out value learning as a solution.
Edit: Retracted because some of my exegesis of the historical seed AI concept may not be accurate
There will be future superintelligent AIs that improve themselves. But they will be neural networks; they will at the very least start out as compute-intensive projects; and in the infant stages of their self-improvement cycles they will understand and be motivated by human concepts, rather than being dumb specialized systems that are only good for bootstrapping themselves to superintelligence.
How does the question of whether AI outcomes are more predictable than AI trajectories reduce to the (vague) question of whether observations on current AIs generalize to future AIs?
To be blunt, it's not just that Eliezer lacks a positive track record in predicting the nature of AI progress, which might be forgivable if we thought he had really good intuitions about this domain. Empiricism isn't everything; theoretical arguments are important too and shouldn't be dismissed. But-
Eliezer thought AGI would be developed from a recursively self-improving seed AI coded up by a small group, "brain in a box in a basement" style. He dismissed and mocked connectionist approaches to building AI. His writings repeatedly downplayed the importance ...
Due partly to the choice of using 'value' as a speaker-dependent variable, some of the terminology used in this article doesn't align with how the terms are used by professional metaethicists. I would strongly suggest one of:
1) replacing the phrase "moral internalism" with a new phrase that better individuates the concept.
2) including a note that the phrase is being used extremely non-standardly.
3) adding a section explaining the layout of metaethical possibilities, using moral internalism in the sense intended by professional metaethicists.
In metaethics, ...
I have a few complaints/questions:
1) "What is goodness made out of" is not really a particularly active discussion in professional philosophy. I feel that this was put in there just to make analytic philosophers look silly. And anyways, if one believes in naturalistic moral properties (the stuff that we value,) then "what is goodness made out of" really is the question "what is good," which I think is probably a fine question. In this case, rephrasing in terms of AI just makes philosophical discussions more wordy and less accessible.
2) "Faced with any phil...
The section on Moral Internalism is slightly inaccurate, or at least misleading. Internalism is the metaethical view that an agent cannot judge something to be right and yet not be the least bit motivated to perform it. As such, it is really a semantic claim about the meaning of moral vocabulary: whether or not it is part of the meaning of "that is right" or "that is wrong" that the speaker approves or disapproves respectively of an action. Internalism, then (as intended by analytic philosophers), is totally compatible with the Orthogonality Thesis...
If the choice were between Claude - and you personally - “growing up” and becoming master of the universe, then I understand why it being Claude instead of you might be upsetting. But if the choice is between Claude and humanity as a whole, I don’t see why you’re so sure humanity is the better choice here.