I think it's also the case that there are no true laws of which we can speak, only various interpretations of laws, though hopefully most people's interpretations agree. Much of the challenge within the legal profession is figuring out how to interpret laws.
Within common law (the only legal system I'm really familiar with), this issue of interpretation is actually a key aspect of the system. Judges serve the purpose of providing interpretations, and law is operationalized via decisions on cases that set precedents that can inform future in...
This is how I used to buy clothes. At least in my case I got some hard advice from a friend: I was picking pieces of clothing that were fine in isolation but didn't come together to create a look/fit that was me, which made me look unintentional and thus less good. It also made it too easy to optimize for function at the expense of form, to the point of picking things that met great functional requirements but looked bad, like technical hiking pants that met tons of needs other than looking good or fitting my body well.
In order to actually look put together I realized that I needed to take a more global approach to my clothes optimization.
Actually, I kind of forgot what ended up in the paper, but then I remembered, so I wanted to update my comment.
There was an early draft of this paper that talked about deontology, but because there are so many different forms of deontology it was hard to come up with arguments that weren't broken by some version of deontological reasoning, so I instead switched to talking about the question of moral facts independent of any particular ethical system. That said, the argument I make in the paper suggesting that moral realism is more dangerous than moral an...
I sometimes literally have to say this in long threads. Sometimes in a thread of conversation, my interlocutor simply has too big an inferential gap for me to help them cross, and the kind but maybe not maximally nice thing to do is stop wasting both of our time. This happens for a variety of reasons, and being able to express something about it is useful.
In everyday conversation we have norms against this because such statements are status moves to shut down conversations, and making such a move here does risk a status hit if others think you are making a gambit to give up a line of conversation that is proving you wrong, for example. But ultimately there's nothing in the reacts you can't just say with a comment.
I don't see it in the references so you might find this paper of mine (link is to Less Wrong summary, which links to full thing) interesting because within it I include an argument suggesting building AI that assumes deontology is strictly more risky than building one that does not.
I don't think there's anything wrong with presenting arguments that the orthogonality thesis might be false. However, if those arguments are poorly argued or just rehash previously argued points without adding anything new then they're likely to be downvoted.
I actually almost upvoted this because I want folks to discuss this topic, but ultimately downvoted because it doesn't actually engage in arguments that seem likely to convince anyone who believes the orthogonality thesis. It's mostly just pointing at a set of intuitions that cause surprise at the orth...
If it is the case that OpenAI is already capable of building a weakly general AI by this process, then I guess most of the remaining uncertainty lies in determining when it's worthwhile for them or someone like them to do it.
Also while I'm leaving feedback, I think there's too much nuance/overlap between some of the reactions. I think I'd prefer a smaller set that was something like:
Basically my theory is that reactions should be clearly personal reactions and stuff that can't be objected to (e.g. I can't object if you found my presentation overcomplicated, that's just how you felt about it), and anything that can be read as a bid to make claims should not be included because there's no easy way to respond to a reaction. I think on this grounds I also dislike the strawman and seems borderline reactions.
Neat. Looking at the list of reactions, one jumps out to me as out of place: the wrong reaction. The others reflect various feelings or perceptions and can be interpreted that way, but the wrong one seems too strong to me and overlaps with the existing agree/disagree voting. If you think something is wrong and want more than the disagree vote, seems like that's a case where we want to incentivize posting a reply rather than just leaving a wrong react with no explanation.
- EA, for various cultural reasons, is a toxic brand in China. It's not any single component of EA, but rather the idea of altruism itself. Ask anyone who's lived in China for a few years and they will understand where I'm coming from. I think the best way forward for AI safety in China is to dissociate from EA. Rationality is more easily accepted, but spreading related ideas is not the most effective way to address AI safety in China.
I'd like to know more about this. What's the deal with altruism in China? Why is altruism disliked?
I like that this offers a clearer theory of what boundaries are than most things I've read on the subject. I often find the idea of boundaries weird, not because I don't understand that sometimes people need to put up social defenses of various kinds to feel safe, but because I've not seen a very crisp definition of boundaries that didn't produce a type error. Framing them in terms of bids for greater connection hits at a lot of what I think folks care about when they talk about setting boundaries, so it makes a lot more sense to me now than my previous understanding, which was more like "I'm going to be emotionally closed here because I can't handle being open". That's still kind of true but mixes in a lot of stuff and so is not a crisp notion.
I really like this idea, since in an important sense these are accident risks: we don't intend for AI to cause existential catastrophe, but it might if we make mistakes (and we make mistakes by default). I get why some folks in the safety space might not like this framing, because accidents imply there's some safe default path from which accidents are deviations, when in fact "accidents" are the default thing AIs do and we have to thread a narrow path to get good outcomes. Still, it seems like a reasonable way to move the conversation forward with the general pub...
To the point about using NVC for positive things too: as a manager I try to keep something like this in mind when giving feedback to reports, both to signal where they need to improve and to let them know when they're doing well. I picked up the idea from reading books about how to parent and teach kids, but the same ideas apply.
The big thing, as I think of it, is to avoid making fundamental attribution errors, or as I really think of it, don't treat your observations of behavior patterns as essential characteristics of a person. Both negative and positive ...
If the mind becomes much more capable than the surrounding minds, it does so by being on a trajectory of creativity: something about the mind implies that it generates understanding that is novel to the mind and its environment.
I don't really understand this claim enough to evaluate it. Can you expand a bit on what you mean by it? I'm unsure about the rest of the post because it's unclear to me what the premise your top-line claim rests upon means.
I appreciate the sentiment, but I find something odd about expecting ontology to be backwards compatible. Sometimes there are big, insightful updates that reshape ontology. Those are sometimes not compatible with the old ontology, except insofar as both were attempting to model approximately the same reality. As an example, at some point in the past I thought of people as having character traits; now I think of character traits as patterns I extract from observed behavior and not something the person has. The new ontology doesn't seem backwards compatible to me, except that it's describing the same reality.
This seems like a generalization of something that humans are also guilty of. The way we win against other animals also can look kind of dumb from the perspective of those animals.
Suppose you're a cheetah. The elegant, smart way to take down prey is to chase them down in a rapid sprint. The best takedowns are ones where you artfully outmaneuver your prey and catch them right at the moment when they think they are successfully evading you.
Meanwhile you look on humans with disdain. They can take down the same prey as you, but they do it in dumb ways. Sometimes th...
My own guess here is that access to capital will become more important than it is today by an order of magnitude.
In the forager era capital barely mattered because almost all value was created via labor. With no way to reliably accumulate capital, there was little opportunity to exploit it.
In the farmer era, capital became much more important, mainly in the form of useful land, but labor remained of paramount importance for generating value. If anything, capital made labor more valuable and thus demanded more of it.
In the industrial era, capital became more...
So I have a vague theory as to why this might work. It's kind of nuts, but the fact that the product works for you at all is kind of nuts, so here we are.
During meditation many people will start to spontaneously rhythmically rock back and forth or side to side as they enter jhana states. This generally coincides with greater feelings of joy, contentment, and focus. I'm not sure why this happens, but I've seen lots of people do it and I do it myself. My best guess is this has something to do with brain wave harmonics.
My guess is that the vibrations of this dev...
Seems unlikely, both because I doubt the premise that an RO, whatever it looked like, would be significantly more or less trainable than IQ measurements (based on the fact that supposed measures of learnable knowledge like the SAT and GRE are so strongly correlated with IQ) and because if it had any measurement power it would, like the SAT and GRE, quickly become embroiled in politics due to disparities in outcomes between individuals and groups.
Related thought: because intelligence is often a proxy for status, calling someone or something dumb implies low status. This is why, for example, I think people get really worked up about IQ: being smart is not a matter of simple fact, but a matter of status assignment. As a society we effectively punish people for being dumb, and so naturally the 50% of the population that's below the mean has a strong incentive to fight back if you try to make explicit a way in which they may be treated as having lower status. Heck, it's worse than that: if you're not i...
As a manager (and sometimes middle manager) I've been thinking about how LLMs are going to change management. Not exactly your topic but close enough. Here are my raw thoughts so far:
To expand a bit, I think this post is confusing ontological and ontic existence, or in LW terms mixing up existence in the map and existence in the territory.
I don't think it makes sense to say that the symbol grounding problem has gone away, but I think it does make sense to say that we were wrong about which problems couldn't be solved without first solving symbol grounding. I also don't think we're really that confused about how symbols are grounded (1, 2, 3), although we don't yet have a clear demonstration of a system that has grounded its symbols in reality. GPTs do seem to be grounded by proxy through their training data, but this gives limited amounts of grounded reasoning today, as you note.
My guess would be that it'll be on the level of evals done internally by these companies today to make sure generative AI models don't say racist things or hand out bomb making instructions, etc.
Unclear to me how "serious" this really is. The US government has its hands in lots of things and spends money on lots of stuff. It's more serious than it was before, but to me this seems pretty close to the least they could be doing and not be seen as ignoring AI in ways that would be used against them in the next election cycle.
Appears to be a duplicate of https://www.lesswrong.com/posts/yBJftSnHcKAQkngzb/white-house-announces-new-actions-to-promote-responsible-ai
To answer my own question:
Level of AI risk concern: high
General level of risk tolerance in everyday life: low
Brief summary of what you do in AI: I first tried to formalize what alignment would mean; this led me to work on a program of deconfusing human values that reached the end of what I could do; now I've moved on to writing about epistemology that I think is critical to understand if we want to get alignment right
Anything weird about you: prone to anxiety; previously dealt with OCD, mostly cured it with meditation, but it still pops up sometimes
A form is not just a form. I also have to follow up to make sense of the responses, report back findings, etc. Possibly worth exploring if it seems like there might be something there, but not the effort I want to put in now. By answering here I can ignore this, and others can still benefit if I do nothing else with the idea.
If I learn enough this way to suggest it's worth exploring and doing a real study, sure. This is a case of better done lazily to get some information than not done at all.
I think I disagree. Based on your presentation here, I think someone following a policy inspired by this post would be more likely to cause existential catastrophe by pursuing a promising false positive that actually destroys all future value in our Hubble volume. I've argued we need to focus on minimizing false positive risk rather than optimizing for max expected value, which is what I read this as proposing we do.
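To make the contrast concrete, here's a minimal sketch in Python of how the two decision rules can come apart. The names and numbers are entirely made up for illustration; this is a toy, not anyone's actual method:

```python
# Toy comparison of two decision rules over candidate interventions.
# All names and numbers are made up purely for illustration.

interventions = {
    # name: (estimated_value, prob_the_promise_is_a_catastrophic_false_positive)
    "promising_but_unvetted": (100.0, 0.05),
    "modest_and_well_understood": (10.0, 0.0),
}

def max_expected_value_rule(options):
    # Pick whatever looks best by its estimated value.
    return max(options, key=lambda k: options[k][0])

def min_false_positive_rule(options, tolerance=0.0):
    # Screen out options whose apparent promise might be a catastrophic
    # false positive, then pick the best of what's left.
    safe = {k: v for k, v in options.items() if v[1] <= tolerance}
    return max(safe, key=lambda k: safe[k][0])

print(max_expected_value_rule(interventions))   # promising_but_unvetted
print(min_false_positive_rule(interventions))   # modest_and_well_understood
```

The rules only diverge when an option's apparent promise might itself be the error, which is exactly the false-positive case I'm worried about.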
Thanks for your comment. I introduce the relative/absolute split in notions of truth in a previous chapter, so I expect readers of this chapter, as they progress through the book, to understand what it means.
I don't think this is a good idea. You and others may reasonably disagree, but here's my thinking:
Protests create an us vs. them mentality. Two groups are pitted against each other, with the protestors typically cast in the role of victims who are demanding to be heard.
I don't see this achieving the ends we need. If people push OpenAI to be for or against AI development, they are going to be for development. A protest, as I see it, risks making them dig in to a position and be less open to cooperating on safety efforts.
I'd rather see continued behind the s...
What? Nothing is in conflict; you just took quotes out of context. The full sentence where Zvi commits is:
So unless circumstances change substantially – either Yann changes his views substantially, or Yann’s actions become far more important – I commit to not covering him further.
I'm doubtful of some of your examples of convergence, but despite that I think you are close to presenting an understanding of why things like acausal trade might work.
Even if every one of your object level objections is likely to be right, this wouldn't shift me much in terms of policies I think we should pursue because the downside risks from TAI are astronomically large even at small probabilities (unless you discount all future and non-human life to 0). I see Eliezer as making arguments about the worst ways things could go wrong and why it's not guaranteed that they won't go that way. We could get lucky, but we shouldn't count on luck, so even if Eliezer is wrong he's wrong in ways that, if we adopt policies that acc...
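As a toy version of the arithmetic behind this (the value figure is an arbitrary stand-in; the argument only needs it to be astronomically large):

```python
# Toy expected-loss calculation at a few p(doom) values.
# VALUE_AT_STAKE is an arbitrary stand-in for all future and non-human value;
# the only point is that it dwarfs the cost of cautious policies.
VALUE_AT_STAKE = 1e15

for p_doom in (0.5, 0.1, 0.05, 0.01):
    expected_loss = p_doom * VALUE_AT_STAKE
    print(f"p(doom) = {p_doom:.2f} -> expected loss ~ {expected_loss:.0e}")
```

Even at the low end the expected loss swamps ordinary policy costs, which is why modest disagreements about the probability don't change what I think we should do.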
I am reasonably sympathetic to this argument, and I agree that the difference between EY's p(doom) > 50% and my p(doom) of perhaps 5% to 10% doesn't obviously cash out into major policy differences.
I of course fully agree with EY/Bostrom/others that AI is the dominant risk, we should be appropriately cautious, etc. This is more about why I find EY's specific classic doom argument to be uncompelling.
My own doom scenario is somewhat different and more subtle, but mostly beyond scope of this (fairly quick) summary essay.
This doesn't seem especially "global" to me then. Maybe another term would be better? Maybe this is a proximate/ultimate distinction?
I'm suspicious of your premise that evolution or anything else is doing true global optimization. If the frame is the whole universe, all optimization is local optimization because of things like the speed of light limiting how fast information can propagate. Even if you restrict yourself to a Hubble volume this would still be the case. In essence, I'd argue all optimization is local optimization.
Appreciate you thinking about this question, but I also downvoted the post. Why? This is the kind of low-effort, low-context post that I don't want to see a lot of on LessWrong.
A good version of this question would have presented some more context rather than just rehashing a quite old and well-worn topic without adding anything new. For example, it could have pointed to some new information specifically leading you to rethink the standard takes on this question, making it worth reconsidering.
I don't have especially strong opinions about what to do here. But, for the curious, I've had run-ins with both Said and Duncan on LW and elsewhere, so perhaps this is useful background information for folks outside the moderation team looking at this who aren't already aware (I know the moderators are aware of basically everything I have to say here because I've talked to some of them about these situations).
Also, before I say anything else, I've not had extensive bad interactions with either Said or Duncan recently. Maybe that's because I've been writing a book instea...
Sorry for the lack of links above.
I affirm the accuracy of Gordon's summary of our interactions; it feels fair and like a reasonable view on them.
For what it's worth, I think you're unusually uncomfortable with people doing this. I've not read the specific thread you're referring to, but I recall you expressing especially and unusually high dislike for others performing analysis of your mind and motivations.
I'm not sure what to do with this, only that I think it's important background for folks trying to understand the situation. Most people dislike being psychoanalyzed to some extent, but you seem like P99 in your degree of dislike. And, yes, I realize that annoyingly my comment is treading in the direction of analyzing you, but I'm trying to keep it just to external observations.
I think this seems worth digging into. I've done my own digging, though not spent a lot of time thinking in detail about how it generalizes to minds unlike ours, although I think there's some general structure here that should generalize.
Hard agree. I think there's a tendency among folks to let fears of doom eat their minds in ways that make them give up without even trying. Some people give up outright. Others think they're trying to avert doom, but they've actually given up and are just trying because they're anxious and don't know what else to do; they don't expect their attempt to work, so they only make a show of it, not doing things they could do that stand a real chance of reducing the probability of doom.
I like your analogy to video games. I play DoTA (and have for a long time; ...
My comment will be vague because I'm not sure how much permission I have to share this or if it's been publicly said somewhere and I'm just unaware, but I talked to an AI researcher at one of the major companies/labs working on things like LLMs several years ago, before even GPT-1 was out, and they told me that your reason 10 was basically their whole reason for wanting to work on language models.
I can see you're taking a realist stance here. Let me see if I can take a different route that makes sense in terms of realism.
Let's suppose there are moral facts and some norms are true while others are false. An intelligent AI can then determine which norms are true. Great!
Now we still have a problem, though: our AI hasn't been programmed to follow true norms, only to discover them. Someone forgot to program that bit in. So now it knows what's true, but it's still going around doing bad things because no one made it care about following true norms.
This i...
Like it. Seems like another way of saying that sometimes what you really need is more dakka. Tagged the post as such to reflect that.