LW is a known hotbed of compatibilism, so here's my question:
That's not been my impression. I would have summarized it more as "LW (a) agrees that LFW doesn't exist and (b) understands that debating compatibilism doesn't make sense because it's just a matter of definition"
Personally, I certainly don't consider myself a compatibilist (though this is really just a matter of preference since there are no factual disagreements). My brief answer to "does free will exist" is "no". The longer answer is the within-physics stick figure drawing.
No one will hear my counter-arguments to Sabien's propaganda who does not ask me for them privately.
uh, why? Why not make a top level post?
Was about to reread this, but
UPDATE IN 2023: I wrote this a long time ago and you should NOT assume that I still agree with all or even most of what I wrote here. I’m keeping it posted as-is for historical interest.
... are your updated thoughts written up anywhere?
This argument is based on drawing an analogy between
in the sense that both have to get their values into a system. But the two situations are substantially disanalogous because the AI starts with a system that has its values already implemented. It can simply improve parts that are independent of its values. Doing this would be easier with a modular architecture, but it should be doable even without one. It's much easier to find parts of the system that don't affect values than it is to nail down exactly where the values are encoded.
Fair. Ok, I edited the original post, see there for the quote.
One reason I felt comfortable just stating the point is that Eliezer himself framed it as a wrong prediction. (And he actually refers to you as having been more correct, though I don't have the timestamp.)
(Eliezer did think neural nets wouldn't work; he explicitly said it on the Lex Fridman podcast.)
Edit @request from gwern: at 11:30 in the podcast, Eliezer says,
...back in the day I went around saying like, I do not think that just stacking more layers of transformers is going to get you all the way to AGI, and I think that GPT-4 is past where I thought this paradigm is going to take us, and I, you know, you want to notice when that happens, you want to say like "oops, I guess I was incorrect about what happens if you keep on stacking more transformer layers
I think you should quote the bit you think shows that. Which 'neural nets wouldn't work', exactly? I realize that everyone now thinks there's only one kind (the kind which works and which we have now), but there's not.
The Fridman transcript I skimmed was him being skeptical that deep learning, one of several different waves of connectionism, would go from early successes like AlphaGo all the way to AGI, and consistent with what I had always understood him to believe, which was that connectionism could work someday but that would be bad because it would be ...
Why AI would want to align us or end us is something I still haven't figured out after reading about alignment so much.
Has your reading ever included anything related to Instrumental Convergence?
Reporting on my mental state here: I'm emotionally opposed to the name change; I like "effective altruism". I don't want to do quantifiable good; I want to do effective good. Having x-risk and animal welfare in the same category makes perfect sense to me.
The commenter you're responding to mentioned physical and brute force, so I don't think the understanding of intelligence is the crux.
It is fascinating to learn about the extent to which AI technologies like GPT-4 and Copilot X have been integrated into the operations of LessWrong. It is understandable that the LW team wanted to keep this information confidential in order to prevent the potential negative consequences of revealing the economic value of AI.
However, with the information now out in the open, it's important to discuss the ethical implications of such a revelation. It could lead to increased investment in AI, which may or may not be a good thing, depending on how it is regula...
It's probably because GPT learns on the basis of tokens, not letters, so this doesn't really tell you much. If you want to find something it can't do, it'd be more impressive if it were a logic thing, not a syntactic thing.
My guess is that most people who downvoted think popular philosophy is unlikely to be relevant for no-nonsense applications like math and alignment
Half a year ago, we had the discussion about a lot of AI content. It since seems to have gotten more extreme; right now I count a 10-to-3 ratio of AI/x-risk posts to other content on the homepage (here coded as red and blue; one post counted toward both).
I know GPT-4 got just released, so maybe it's fine? Idk, but it really jumped out to me.
(He also did a quite high-effort thing in 2019 which did work. I don't know how well he kept the pounds off in the subsequent time)
I'm kinda confused why this is only mentioned in one answer, and in parentheses. Shouldn't this be the main answer -- like, hello, the premise is likely false? (Even if it's not epistemically likely, I feel like one should politely not assume that he since gained weight unless one has evidence for this.)
This doesn't seem quite right. The information content of agree vs. disagree depends on your prior, i.e., on the prior probability of an agree vote. If that's <0.5, then an agree vote is more informative; if it's >0.5, then a disagree vote is more informative. But it's not obvious that it's <0.5 in general.
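To make the asymmetry concrete, here's a minimal sketch (the function name and the example prior of 0.8 are mine, purely for illustration): the surprisal of a vote, in bits, is the negative log of its prior probability, so whichever vote is rarer under your prior carries more information.

```python
import math

def vote_information(p_agree: float) -> tuple[float, float]:
    """Surprisal (in bits) of an agree vote vs. a disagree vote,
    given a prior probability p_agree that a voter agrees."""
    return -math.log2(p_agree), -math.log2(1 - p_agree)

# With a prior of 0.8 that voters agree, a disagree vote carries more bits:
agree_bits, disagree_bits = vote_information(0.8)
# agree_bits ≈ 0.32, disagree_bits ≈ 2.32
```

At a prior of exactly 0.5 the two votes are equally informative, which is why the claim hinges on whether the prior is below or above that point.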
I know he's talking about alignment, and I'm criticizing that extremely strong claim. This is the main thing I wanted to criticize in my comment! I think the reasoning he presents is not much supported by his publicly available arguments.
Ok, I don't disagree with this. I certainly didn't develop a gears-level understanding of why [building a brain-like thing with gradient descent on giant matrices] is doomed after reading the 2021 conversations. But that doesn't seem very informative either way; I didn't spend that much time trying to grok his arguments.
I assume you're asking if someone can query GPT-4 with this. if so, I did and here's the response.
I would agree with this if Eliezer had never properly engaged with critics, but he's done that extensively. I don't think there should be a norm that you have to engage with everyone, and "ok choose one point, I'll respond to that" seems like better than not engaging with it at all. (Would you have been more enraged if he hadn't commented anything?)
The problem is that even if Quintin Pope's model is wrong, there is other evidence contradicting the AI doom premise that Eliezer ignores, and I believe confirmation bias is at work here.
Also, any issues with Quintin Pope's model are going to be subtle, not obvious, and there's a real difference between arguing against a mix of good and bad arguments and arguing against only bad arguments.
it is almost inevitable that we will be a tedious, frustrating and, shall we say - stubborn and uncooperative "partner" who will be unduly complicating the implementation of whatever solutions the AGI will be proposing.
It will, then, have to conclude that you "can't deal" very well with us, and we have a rather over-inflated sense of ourselves and our nature. And this might take various forms, from the innocuous, to the downright counter-productive.
This all seems to rely on anthropomorphizing the AI to me.
I think you're making the mistake of not clea...
I also don't really get your position. You say that,
[Eliezer] confidently dismisses ANNs
but you haven't shown this!
In Surface Analogies and Deep Causes, I read him as saying that neural networks don't automatically yield intelligence just because they share surface similarities with the brain. This is clearly true; at the very least, using token-prediction (which is a task for which (a) lots of training data exist and (b) lots of competence in many different domains is helpful) is a second requirement. If you took the network of GPT-4 and trained it
Responding to part of your comment:
In that quote, he only rules out a large class of modern approaches to alignment, which again is nothing new; he's been very vocal about how doomed he thinks alignment is in this paradigm.
I know he's talking about alignment, and I'm criticizing that extremely strong claim. This is the main thing I wanted to criticize in my comment! I think the reasoning he presents is not much supported by his publicly available arguments.
That claim seems to be advanced due to... there not being enough similarities between ANNs and human ...
If you mean how I accessed it at all, I used the official channel from OpenAI: https://chat.openai.com/chat
If you have a premium account ($20/month), you can switch to GPT-4 after starting a new chat.
I reject this terminology; I think #2 is superintelligence and #1 is a different dimension.
Also, I would actually differentiate two kinds of #1. There's how much stuff the AI can reason about, which is generality (you can have a "narrow superintelligence" like a chess engine), and there's how much it knows, which is knowledge base/resource access. But I wouldn't call either of them (super)intelligence.
This is pretty funny because the supposed board state has only 7 columns. Yet it's also much better than random. A lot of the pieces are correct... that is, if you count from the left (real board state is here).
Also, I've never heard of using upper and lowercase to differentiate white and black, I think GPT-4 just made that up. (edit: or not; see reply.)
Extra twist: I just asked a new GPT-4 instance whether any chess notation differentiates lower and upper case, and it told me algebraic notation does; but that's the standard notation, and it doesn't. The Wikipedia article also says nothing about it. Very odd.
TAG said that Libertarian Free Will is the relevant one for Newcomb's problem. I think this is true. However, I strongly suspect that most people who write about decision theory, at least on LW, agree that LFW doesn't exist. So arguably almost the entire problem is about analyzing Newcomb's problem in a world without LFW. (And ofc, a big part of the work is not to decide which action is better, but to formalize procedures that output that action.)
This is why differentiating different forms of Free Will and calling that a "Complete Solution" is dubious. It...
I recently listened to Gary Marcus speak with Stuart Russell on the Sam Harris podcast (episode 312, "The Trouble With AI," released on March 7th, 2023). Gary and Stuart seem to believe that current machine learning techniques are insufficient for reaching AGI, and point to the recent adversarial attacks on KataGo as one example. Given this position, I would like Gary Marcus to come up with a new set of prompts that (a) make GPT-4 look dumb and (b) mostly continue to work for GPT-5.
While this is all true, it's worth pointing out that Stuart's position w...
I think a better question is, what does that mean? So many people throw around "GPT doesn't really understand xyz" as if there's a well-defined, unambiguous, universally-accepted notion of "understanding" that's also separate from performance. Perhaps this kind of understanding is a coherent concept, but it's not trivial!
I can't add this tag to existing posts -- doesn't show up in the tag list when I search for "Personal" or "Identity". What's going on?
IME, obviously not? I don't have data, but general social interactions strongly suggest that smart people are nicer.
My model of how this works also suggests they would be nicer. Like, most people are nice to at least some people, so being not-nice is either due to a belief that being not-nice makes sense, or because of lack of self-control. Both of those are probably less common among smarter people. I don't think the correlation is super strong, but it's there.
Also, I don't think you defined the orthogonality thesis correctly. Afaik, Bostrom said that any combination of intelligence and goals is possible; this is not the same as saying that they're not correlated.
Alignment isn't required for high capability. So a self-improving AI wouldn't solve it because it has no reason to.
This becomes obvious if you think about alignment as "doing what humans want" or "pursuing the values of humanity". There's no reason why an AI would do this.
So Elon Musk's anti-woke OpenAI alternative sounds incredibly stupid at first glance, since it implies that he thinks the AI's wokeness or anti-wokeness is the thing that matters.
But I think there's at least a chance that it may be less stupid than it sounds. He admits here that he may have accelerated AI research, that this may be a bad thing, and that AI should be regulated. And it's not that difficult to bring these two together; here are two ideas
note that you can do all this testing without publishing the post, by just saving it as a draft.
From what I understand, in Newcomb's Problem, you're sitting there at T2, confronted by Omega, never having thought about any of this stuff before (let's suppose). At that point you can come up with a decision algorithm.
With this sentence, you're again putting yourself outside the experiment; you get a model where you-the-person-in-the-experiment is one thing inside the experiment, and you-the-agent is another thing sitting outside, choosing what your brain does.
But it doesn't work that way. In the formalism, your decision procedure describes your entire brain. (Which is the...
Yeah, the ending was one of the most disappointing things I've ever read, alas. But the first two parts are strong.
Let's formalize it. Your decision procedure is some kind of algorithm that can do arbitrary computational steps but is deterministic. To model "arbitrary computational steps", let's just think of it as a function f outputting an arbitrary string representing thoughts or computations or whatnot. The only input to f is the time step, since you don't receive any other information. So in the toy model, your thoughts at step t are f(t). Also your output must be such that it determines your decision, so we can define a predicate that takes your thoughts and looks wh...
Also, your babble should be aligned with your sense of strategy
I don't agree. (I think? I'm not sure what you mean by strategy.) I think your comment should try to track what's actually going on, not what you want to be going on.
Also note that I was trying to give an impression of what the median response could be. (Which was itself not very well thought out, but that was the attempt.) And the people who take the time to comment have very likely put more thought into their vote than the median, so even if they represented their reasons accurately, it'd still be a distorted picture.
I'm not making a normative claim. The factual claim I'd make is that complex explanations detailing several factual reasons are a bad answer to "why did my post get downvoted" because most people most of the time don't put anywhere near that much thought into their votes. (I also think votes on LW are way more meaningful than anywhere else on the internet, but I don't see a contradiction.)
Afaik the semi-official sequel to hpmor is Significant Digits, since Eliezer has endorsed that one (forgot where but I remember him saying it).
When exactly is ? Is it before or after Omega has decided on the contents of the box?
If it's before, then one-boxing is better. If it's after, then again you can't change anything here. You'll do whatever Omega predicted you'll do.
Note that you don't need to explicitly dislike anything about a post to downvote it; it's enough if you didn't think it added anything. There's lots of people posting stuff (I feel like it's gotten more in the past year, too?) and limited space on the front page.
Or, you know, you can just be annoyed with the tone and downvote it without thinking much. The fact that downvotes are free (for the user who gives them out) and anonymous encourages that kind of behavior.
Personally, I'm very annoyed with the phrase "many worlds interpretation of wave function col...
Suppose that instead of you confronting Omega, I am confronting Omega, while you are watching the events from above, and you can choose to magically change my action to something different. That is, you press a button, and if you do, the neuron values in my brain get overridden such that I change my behavior from one-boxing to two-boxing. Nothing else changes, Omega already decided the contents of the boxes, so I walk away with more money.
This different problem is actually not different at all; it's isomorphic to how you've just framed the problem. You've ...
It was that general debate about content moderation. Pretty sure it wasn't all in the comments of that post (though that may have been the start); I don't remember the details. It's also possible that my recollection includes back and forth you had with [other people who defended my general position].
I think I should just add my own data point here, which is that Zack and I have been on polar opposite sides of a pretty emotional debate before, and I had zero complaints about their conduct. In fact, ever since then, I think I'm more likely to click on a post if I see that Zack wrote it.
I think the claim here is supposed to be that if the principle works for bacteria, it can't tell you that much.[1] That's true for your laws of physics example as well; nothing is gained from taking the laws of physics as a starting point.
That said, this doesn't seem obviously true; I don't see why you can't have a principle that holds for every system yet tells you something very important. Maybe it's not likely, but doesn't seem impossible. ↩︎
I think you implied it by calling them assumptions in your first comment, and magical thinking in your second. Arguments you disagree with aren't really either of those things.
I don't get the downvotes, this post is just agreeing with the OP.