All of deepthoughtlife's Comments + Replies

No. That's a foolish interpretation of domain insight. We have a massive number of highly general strategies that nonetheless work better for some things than others. A domain insight is simply some kind of understanding involving the domain being put to use. Something as simple as whether to use a linked list or an array can involve a minor domain insight. Whether to use a Monte Carlo search or a depth-limited search, and so on, are definitely insights. Most advances in AI to this point have in fact been based on domain insights, and only a small amount on sca... (read more)

quanticle (+2) · 9mo
And now we have the same algorithms that were used to conquer Go and chess being used to conquer matrix multiplication [https://arstechnica.com/information-technology/2022/10/deepmind-breaks-50-year-math-record-using-ai-new-record-falls-a-week-later/]. Are you still sure that AlphaZero is "domain specific"? And if so, what definition of "domain" covers board games, Atari video games [https://www.deepmind.com/blog/muzero-mastering-go-chess-shogi-and-atari-without-rules], and matrix multiplication? At what point does the "domain" in question just become, "Thinking?"

I do agree with your rephrasing. That is exactly what I mean (though with a different emphasis).

I agree with you. The biggest leap was going to human generality level for intelligence. Humanity already is a number of superintelligences working in cooperation and conflict with each other; that's what a culture is. See also corporations and governments. Science too. This is a subculture of science worrying that it is superintelligent enough to create a 'God' superintelligence.

To be slightly uncharitable, the reason to assume otherwise is fear - either their own, or a desire to play on that of others. Throughout history people have looked for reasons why civilizat... (read more)

Honestly, Illusionism is just really hard to take seriously. Whatever consciousness is, I have better evidence it exists than anything else, since it is the only thing I actually experience directly. I should pretend it isn't real...why exactly? Am I talking to slightly defective P-zombies?


"If the computer emitted it for the same reasons..." is a clear example of a begging-the-question fallacy. If a computer claimed to be conscious because it was conscious, then it logically has to be conscious, but that is the possible dispute in the first place. If you claim ... (read more)

Steven Byrnes (-1) · 10mo
In an out-of-body experience, you can “directly experience” your mind floating on the other side of the room. But your mind is not in fact floating on the other side of the room. So what you call a “direct experience”, I call a “perception”. And perceptions can be mistaken—e.g. optical illusions. So, write down a bulleted list of properties of your own consciousness. Every one of the items on your list is a perception that you have made about your own consciousness. How many of those bulleted items are veridical perceptions—perceiving an aspect of your own consciousness as it truly is—and how many of them are misperceptions? If you say “none is a misperception”, how do you know, and why does it differ from all other types of human perception in that respect, and how do you make sense of the fact that some people report that they were previously mistaken about properties of their own consciousness (e.g. “enlightened” Buddhists reflecting on their old beliefs)? Or if you allow that some of the items on your bulleted list may be misperceptions, why not all of them?? To be clear, this post is about AGI, which doesn’t exist yet, not “modern AI”, which does.

As individuals, humans routinely and successfully do things much too hard for them to fully understand. This is partly due to innately hardcoded stuff (mostly for things we think are simple, like vision and controlling our bodies' automatic systems), somewhat due to innate personality, but mostly due to the training process our culture puts us through (for everything else).

For their part, cultures can take the inputs of millions to hundreds of millions of people (or even more when stealing from other cultures), and distill them into both insights and pract... (read more)

I'm hardly missing the point. It isn't impressive to have it be exactly 75%, not more or less, so the fact that it can't always be that is irrelevant. His point isn't that that particular exact number matters; it's that the number eventually becomes very small. But since the number being very small compared to what it should be does not prevent it from being made smaller by the same ratio, his point is meaningless. It isn't impressive to fulfill an obvious bias toward updating in a certain direction.

It doesn't take many people to cause these effects. If we make them 'the way', following them doesn't take an extremist, just someone trying to make the world better, or some maximizer. Both these types are plenty common, and don't have to make it fanatical at all. The maximizer could just be a small band of petty bureaucrats who happen to have power over the area in question. Each one of them just does their role, with a knowledge that it is to prevent overall suffering. These aren't even the kind of bureaucrats we usually dislike! They are also monsters, because the system has terrible (and knowable) side effects.

I don't have much time, so:

While footnote 17 can be read as applying, it isn't very specific.

For all that you are doing math, this isn't mathematics, so base needs to be specified.

I am convinced that people really do give occasional others a negative weight.

And here are some notes I wrote while finishing the piece (that I would have edited and tightened up a lot) (it's a bit all over the place):

This model obviously assumes utilitarianism.
Honestly, their math does seem reasonable to account for people caring about other people (as long as they care about t... (read more)

Kaarel (+1) · 10mo
It seems to me that this matters in case your metaethical view is that one should do pCEV, or more generally if you think matching pCEV is evidence of moral correctness. If you don't hold such metaethical views, then I might agree that (at least in the instrumentally rational sense, at least conditional on not holding any metametalevel views that contradict these) you shouldn't care.

> Why is the first example explaining why someone could support taking money from people you value less to give to other people, while not supporting doing so with your own money? It's obviously true under utilitarianism

I'm not sure if it answers the question, but I think it's a cool consideration. I think most people are close to acting weighted-utilitarianly, but few realize how strong the difference between public and private charity is according to weighted-utilitarianism.

> It's weird to bring up having kids vs. abortion and then not take a position on the latter. (Of course, people will be pissed at you for taking a position too.)

My position is "subsidize having children, that's all the regulation around abortion that's needed". So in particular, abortion should be legal at any time. (I intended what I wrote in the post to communicate this, but maybe I didn't do a good job.)

> democracy plans for right now

I'm not sure I understand in what sense you mean this? Voters are voting according to preferences that partially involve caring about future selves. If what you have in mind is something like people being less attentive about costs policies cause 10 years into the future and this leads to discounting these more than the discount from caring alone, then I guess I could see that being possible. But that could also happen for people's individual decisions, I think? I guess one might argue that people are more aware about long-term costs of personal decisions than of policies, but this is not clear to me, especially with more analysis going into policy decisions.

> As to yo

I'm only a bit of the way in, and it is interesting so far, but it already shows signs of needing serious editing, and there are other ways it is clearly wrong too.

In 'The inequivalence of society-level and individual charity' they list the scenarios as 1, 1, and 2 instead of A, B, C, as they later use. Later, it refers incorrectly to preferring C to A with different necessary weights, when the second reference is to preferring C to B.

The claim that money becomes utility as a log of the amount of money isn't true, but is probably close enough for this kind of u... (read more)

Kaarel (+1) · 1y
Thanks for the comments! I agree and I published an edit fixing this just now.

I mostly agree, but I think footnote 17 covers this? I think the standard in academic mathematics is that log x := log_e x (https://en.wikipedia.org/wiki/Natural_logarithm#Notational_conventions), and I guess I would sort of like to spread that standard :). I think it's exceedingly rare for someone to mean base 10 in this context, but I could be wrong. I agree that base 2 is also reasonable though. In any case, the base only changes utility by scaling by a constant, so everything in that subsection after the derivative should be true independently of the base. Nevertheless, I'm adding a footnote specifying this.

I'm having a really hard time imagining thinking this about someone else (I can imagine hate in the sense of like... not wanting to spend time together with someone and/or assigning a close-to-zero weight), but I'm not sure – I mean, I agree there definitely are people who think they non-instrumentally want the people who killed their family or whatever to suffer, but I think that's a mistake? That said, I think I agree that for the purposes of modeling people, we might want to let weights be negative sometimes.

I think it's partly that I just wanted to have some shorthand for "assign equal weight to everyone", but I also think it matches the commonsense notion of being perfectly altruistic. One argument for this is that 1) one should always assign a higher weight for oneself than for anyone else (also see footnote 12 here) and 2) if one assigns a lower weight to someone else, then one is not perfectly altruistic in interactions with that person – given this, the unique option is to assign equal weight to everyone.
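As a side note spelling out why the base doesn't matter (this is just the standard change-of-base identity, nothing beyond what Kaarel says above):

\[
\log_b x = \frac{\ln x}{\ln b}, \qquad \text{so} \qquad u_b(m) = \log_b m = \frac{1}{\ln b}\,\ln m .
\]

Changing the base rescales the utility function by the positive constant 1/ln b, which leaves the sign of every derivative and every ratio of marginal utilities unchanged.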

I strongly disagree. It would be very easy for a non-omnipotent, unpopular government that has limited knowledge of the future, and that will be overthrown in twenty years, to do a hell of a lot of damage with negative utilitarianism, or any other imperfect utilitarianism. On a smaller scale, even individuals could do it alone.

A negative utilitarian could easily judge that something that had the side effect of making people infertile would cause far less suffering than not doing it, causing immense real world suffering amongst the people who wanted to ha... (read more)

George3d6 (+1) · 1y
But you're thinking of people completely dedicated to an ideology. That's why I'm saying a "negative utilitarian charter" rather than "a government formed of people autistically following a philosophy"... much like, e.g. the US government has a "liberal democratic" charter, or the USSR had a "communist" charter of sorts. In practice these things don't come about because members in the organization disagree, secrets leak, conspiracies are throttled by lack of consensus, politicians voted out, engineered solutions imperfect (and good engineers and scientists are aware of as much)

A lot of this depends on your definition of doomsday/apocalypse. I took it to mean the end of humanity, and a state of the world we consider worse than our continued existence. If we valued the actual end state of the world more than continuing to exist, it would be easy to argue it was a good thing, and not a doom at all. (I don't think the second condition is likely to come up for a very long time as a reason for something to not be doomsday.) For instance, if each person created a sapient race of progeny that weren't human, but they valued as their own ... (read more)

Interactionism would simply require an extension of physics to include the interaction between the two, which would not defy physics any more than adding the strong nuclear force did. You can hold against it that we do not know how it works, but that's a weak point because there are many things where we still don't know how they work.

Epiphenomenalism seems irrelevant to me since it is simply a way you could posit things to be. A normal dualist ignores the idea because there is no reason to posit it. We can obviously see how consciousness has effects on the... (read more)

TAG (+1) · 1y
It would be a problem if all the existing forces fully explain everything, i.e. closure. If you do have closure, and you don't have overdetermination, then you get epiphenomenalism whether you want it or not. I partly agree. I don't see how closure can be proven without proving determinism.

If they didn't accept physical stuff as being (at least potentially) equal to consciousness they actually wouldn't be a dualist. Both are considered real things, and though many have less confidence in the physical world, they still believe in it as a separate thing. (Cartesian dualists do have the least faith in the real world, but even they believe you can make real statements about it as a separate thing.) Otherwise, they would be a 'monist'. The 'dual' is in the name for a reason. 

Slider (+2) · 1y
To me it seems that:
interaction -> really one connected substance -> monism
no interaction -> separated islands -> triviality
If mind is "all we have proof of" then why believe in the unproven parts? Is there some kind of "indirect" evidence for matter? Experiences of Azeroth are real but Azeroth is not real. How could we tell whether we have experiences of physics and on top of that physics being real? In this context "real world" is very loaded as we are arguing which parts are real and which illusory or unreal.

This is clearly correct. We know the world through our observations, which clearly occur within our consciousness, and thus at least equally prove our consciousness. When something is being observed, you can assume that the something doing the observing must exist. If my consciousness observes the world, my consciousness exists. If my consciousness observes itself, my consciousness exists. If my consciousness is viewing only hallucinations, it still exists for that reason. I disagree with Descartes, but 'I think therefore I am' is true of logical necessity.

I do not like immaterialism personally, but it is more logically defensible than illusionism.

Shiroe (+1) · 1y
Refuting your illusionism about your own experiences is very easy; all that you have to do is look at your hands. If that can be denied by some razor, then so can all of science and mathematics as well.

The description and rejection given of dualism are both very weak. Also, dualism is a much broader group of models than is admitted here.


The fact is, we only have direct evidence of the mind, and everything else is just an attempt to explain certain regularities. An inability to imagine that the mind could be all that exists is clearly just a willful denial, and not evidence, but notably, dualism does not require nor even suggest that the mind is all there is, just that it is all we have proof of (even in the cartesian variant). Thus, dualism.

Your personal... (read more)

TAG (+1) · 1y
Yep. Well, there are some pretty difficult issues around causal closure, interactionism, and epiphenomenalism.
Slider (+2) · 1y
With the primacy of direct observation, the "consciousness stuff" stands pretty firm, but I don't see why a dualist would be compelled to think that matter would be a fundamental thing. After all, it's a pattern in experience, so why should this "pattern" be promoted to a substance? How would one be able to tell whether matter is the same "consciousness stuff" in a different form? (And why does this principle not lead to splitting matter into radiation and baryonic matter as two actually separate substances?)

I was replying to someone asking why it isn't 2-5 years. I wasn't making an actual timeline. In another post elsewhere on the site, I mention that they could give memory to a system now and it would be able to write a novel.

Without doing so, we obviously can't tell how much planning they would be capable of if we did, but current models don't make choices, and thus can only be scary for whatever people use them for, and their capabilities are quite limited.

I do believe that there is nothing inherently stopping the capabilities researchers from switching o... (read more)

You're assuming that it would make sense to have a globally learning model, one constantly still training, when that drastically increases the cost of running the model over present approaches. Cost is already prohibitive, and to reach that many parameters any time soon would be exorbitant (but that will probably happen eventually). Plus, the sheer amount of data necessary for such a large one is crazy, and you aren't getting much data per interaction. Note that Chinchilla recently showed that lack of data is a much bigger issue right now for models than lack of pa... (read more)
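For rough context (these figures are the headline numbers from the Chinchilla paper, Hoffmann et al. 2022, not anything established in this thread): compute-optimal training wants on the order of 20 tokens of data per parameter,

\[
D_{\text{opt}} \approx 20\,N ,
\]

so Chinchilla itself paired roughly 70B parameters with about 1.4 trillion training tokens, which is orders of magnitude more text than any realistic stream of user interactions would supply.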

Nathan Helm-Burger (+2) · 1y
I agree that current models seem to be missing some critical pieces (thank goodness!). I think perhaps you might be overestimating how hard it will be to add in those missing pieces if the capabilities research community focuses their primary intention on them. My guess is it'd be more like 5-10 years than 20-30 years.

You're assuming that the updates are mathematical and unbiased, which is the opposite of how people actually work. If your updates are highly biased, it is very easy to just make large updates in that direction any time new evidence shows up. As you get more sure of yourself, these updates start getting larger and larger rather than smaller as they should.

[comment deleted] (0) · 1y

That sort of strategy only works if you can get everyone to coordinate around it, and if you can do that, you could probably just get them to coordinate on doing the right things. I don't know if HR would listen to you if you brought your concerns directly to them, but they probably aren't harder to persuade on that sort of thing than convincing the rest of your fellows to defy HR. (Which is just a guess.) In cases where you can't get others to coordinate on it, you are just defecting against the group, to your own personal loss. This doesn't seem like a g... (read more)

That does sound problematic for his views if he actually holds these positions. I am not really familiar with him, even though he did write the textbook for my class on AI (third edition) back when I was in college. At that point, there wasn't much on the now current techniques and I don't remember him talking about this sort of thing (though we might simply have skipped such a section).

You could consider it that we have preferences on our preferences too. It's a bit too self-referential, but that's actually a key part of being a person. You could determin... (read more)

If something is capable of fulfilling human preferences in its actions, and you can convince it to do so, you're already most of the way to getting it to do things humans will judge as positive. Then you only need to specify which preferences are to be considered good in an equally compelling manner. This is obviously a matter of much debate, but it's an arena we know a lot about operating in. We teach children these things all the time.

Netcentrica (+1) · 1y
Stuart does say something along the same lines that you point out in a later chapter however I felt it detracted from his idea of three principles:
1. The machine's only objective is to maximize the realization of human preferences.
2. The machine is initially uncertain about what those preferences are.
3. The ultimate source of information about human preferences is human behavior.
He goes on at such length to qualify and add special cases that the word “ultimate” in principle #3 seems to have been a poor choice because it becomes so watered down as to lose its authority. If things like laws, ethics and morality are used to constrain what AI learns from preferences (which seems both sensible and necessary as in the parent/child example you provide) then I don’t see how preferences are “the ultimate source of information” but rather simply one of many training streams. I don’t see that his point #3 itself deals with the issue of evil. As you point out this whole area is “a matter of much debate” and I’m pretty confident that like philosophical discussions it will go on (as they should) forever however I am not entirely confident that Stuart’s model won’t end up having the same fate as Marvin Minsky’s “Society Of Mind”.
Archimedes (+8) · 1y
He doesn’t want to give up but doesn’t expect to succeed either. The remaining option is “Dying with Dignity” by fighting for survival in the face of approaching doom.
GuySrinivasan (+2) · 1y
My point was that (0.25)^n for large n is very small, so no, it would not be easy.
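(Spelling the arithmetic out:

\[
0.25^2 = 0.0625, \qquad 0.25^3 \approx 0.016, \qquad 0.25^4 \approx 0.004,
\]

so four such quarterings already leave well under one percent of the original estimate.)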

If they chose to design it with effective long-term memory and a focus on novels (especially prompting via summary), maybe it could write some? They wouldn't be human level, but people would be interested enough in novels written on a whim to match some exact scenario that it could be valuable. It would also be good evidence of advancement, since that is a huge current weakness (the losing track of things).

But wouldn't that be easy? He seems to take every little advancement as a big deal.

GuySrinivasan (+5) · 1y
How many times do you think he has changed his expected time to disaster to 25% of what it was?

I would like to point out that what johnswentworth said about being able to turn off an internal monologue is completely true for me as well. My internal monologue turns itself on and off several (possibly many) times a day when I don't control it, and it is also quite easy to tell it which way to go on that. I don't seem to be particularly more or less capable with it on or off, except on a very limited number of tasks. Simple tasks are easier without it, while explicit reasoning and storytelling are easier with it. I think my default is off when I'm not worried (but I do an awful lot of intentional verbal daydreaming and reasoning about how I'm thinking too).

So the example given to decry a hypothetical, obviously bad situation applies even better to what they're proposing. It's every bit the same coercion as they're decrying, but with less personal benefit and choice (you get nothing out of this deal). And they admit this? This is self-refuting.

Security agencies don't have any more reason to compete on quality than countries do, it's actually less, because they have every bit as much force, and you don't really have any say. What, you're in the middle of a million people with company A security, and you think you can pick B and they'll be able to do anything?

Except that is clearly not real anarchy. It is a balance of power between the states. The states themselves ARE the security forces in this proposal. I'm saying that they would conquer everyone who doesn't belong to one.

arabaga (-1) · 1y
Yes, anarcho-capitalists accept that ~everyone will hire a security agency. This isn't a refutation of anarchism. The point is that security agencies have incentive to compete on quality, whereas current governments don't (as much), so the quality of security agencies would be higher than the quality of governments today.

Anarchists always miss the argument from logical necessity, which I won't actually make because it is too much effort, but in summary, politics abhors a vacuum. If there is not a formal power you must consent to, there will be an informal one. If there isn't an informal one, you will shortly be conquered.

In these proposals, what is to stop these security forces from simply conquering anyone and everyone that isn't under the protection of one? Nothing. Security forces have no reason to fight each other to protect your right not to belong to one. And they wi... (read more)

Arjun Panickssery (+3) · 1y
Channeling Huemer, I'd say that the world's states are in a kind of anarchy and they don't simply gobble each other up all the time. 

The examples used don't really seem to fit with that though. Blind signatures are things many/most people haven't heard of, and not how things are done; I freely admit I had never heard of them before the example. Your HR department probably shouldn't be expected to be aware of all the various things they could do, as they are ordinary people. Even if they knew what blind signatures were, that doesn't mean it is obvious they should use them, or how to do so even if they thought they should (which you admit). After reading the Wikipedia article, that doesn'... (read more)
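For readers who, like me, hadn't run into them before, here is a toy sketch of how an RSA blind signature works; the key, the hashed message, and the blinding factor below are made-up textbook-sized values purely for illustration, not anything from the post:

```python
# Minimal sketch of an RSA blind signature (toy parameters, illustration only).
# Assumes Python 3.8+ so that pow(r, -1, n) computes a modular inverse.

# Toy RSA key pair (classic textbook numbers; never use keys this small in practice)
n, e, d = 3233, 17, 2753

h = 1234          # "hash" of the survey response, reduced mod n (made-up value)
r = 7             # requester's random blinding factor, coprime with n (made-up value)

# Requester blinds the message before sending it to the signer (e.g. HR)
blinded = (h * pow(r, e, n)) % n

# Signer signs the blinded value without ever learning h
blind_sig = pow(blinded, d, n)

# Requester unblinds to obtain an ordinary signature on h
sig = (blind_sig * pow(r, -1, n)) % n

# Anyone can verify the signature against the public key (n, e)
assert pow(sig, e, n) == h
print("signature valid; the signer never saw h =", h)
```

The point of the construction is that the signer endorses the blinded value without seeing the actual response, yet the unblinded signature still verifies, which is what would make an anonymous-but-authenticated survey tally trustworthy.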

jchan (+1) · 1y
If the cryptography example is too distracting, we could instead imagine a non-cryptographic means to the same end, e.g. printing the surveys on leaflets which the employees stuff into envelopes and drop into a raffle tumbler [https://m.media-amazon.com/images/I/714RtULM8WL._AC_UL640_QL65_.jpg]. The point remains, however, because (just as with the blinded signatures) this method of conducting a survey is very much outside-the-norm, and it would be a drastic world-modeling failure to assume that the HR department actually considered the raffle-tumbler method but decided against it because they secretly do want to deanonymize the surveys. Much more likely is that they simply never considered the option. But if employees did start adopting the rule "don't trust the anonymity of surveys that aren't conducted via raffle tumbler", even though this is epistemically irrational at first, it would eventually compel HR departments to start using the tumbler method, whereupon the odd surveys that still are being conducted by email will stick out, and it would now be rational to mistrust them. In short, the Adversarial Argument is "irrational" but creates the conditions for its own rationality, which is why I describe it as an "acausal negotiation tactic".

I am aware of the excuses used to define it as not hearsay, even though it is clearly the same as all other cases of such. Society simply believes it is a valuable enough scenario that it should be included, even though it is still weak evidence.

JBlack (+2) · 1y
Whether something is hearsay is relative to the proposition in question. When Charlie testifies that Bob said that he saw Alice at the club, that's hearsay when trying to establish whether Alice was at the club, or was alive at all, or any other facts about Alice. Charlie is not conveying any direct knowledge about Alice. It is not hearsay in establishing many facts about Bob at the time of the conversation. E.g. where Bob was at the time of the conversation, whether he was acquainted with Alice, or many other such propositions. It also conveys facts about Charlie. Charlie's statement conveys direct knowledge of many facts about the conversation that are not dependent upon the veracity of Bob's statements, and are therefore not hearsay in relation to them.

I was pretty explicit that scale improves things and eventually surpasses any particular level that you get to earlier with the help of domain knowledge...my point is that you can keep helping it, and it will still be better than it would be with just scale. MuZero is just evidence that scale eventually gets you to the place you already were, because they were trying very hard to get there and it eventually worked.

AlphaZero did use domain insights. Just like AlphaGo. It wasn't self-directed. It was told the rules. It was given a direct way to play games, a... (read more)

quanticle (+1) · 10mo
No, that's not what domain insights are. Domain insights are just that, insights which are limited to a specific domain. So something like, "Trying to control the center of the board," is a domain insight for chess. Another example of chess-specific domain insights is the large set of pre-computed opening and endgame books that engines like Stockfish are equipped with. These are specific to the domain of chess, and are not applicable to other games, such as Go. An AI that can use more general algorithms, such as tree search, to effectively come up with new domain insights is more general than an AI that has been trained with domain specific insights. AlphaZero is such an AI. The rules of chess are not insights. They are constraints. Insights, in this context, are ideas about which moves one can make within the constraints imposed by the rules in order to reach the objective most efficiently. They are heuristics that allow you to evaluate positions and strategies without having to calculate all the way out to the final move (a task that may be computationally infeasible). AlphaZero did not have any such insights. No one gave AlphaZero any heuristics about how to evaluate board positions. No one told it any tips or tricks about strategies that would make it more likely to end up in a winning position. It figured out everything on its own and did so at a level that was better than similar AIs that had been seeded with those heuristics. That is the true essence of the Bitter Lesson: human insights often make things worse. They slow the AI down. The best way to progress is just to add more scale, add more compute, and let the neural net figure things out on its own within the constraints that it's been given.

The thing is, no one ever presents the actual strongest version of an argument. Their actions are never the best possible, except briefly, accidentally, and in extremely limited circumstances. I can probably remember how to play an ideal version of the tic-tac-toe strategy that's the reason only children play it, but any game more complicated than that and my play will be subpar. Games are much smaller and simpler things than arguments. Simply noticing that an argument isn't the best it could be is a you thing, because it is always true. Basically no one is a... (read more)

JBlack (+2) · 1y
That type of reporting of statements is not considered hearsay because it is directly observed evidence about the defendant, made under oath. It is not treated as evidence that the defendant was in that other city, but as evidence that they said they were. It can be used to challenge the trustworthiness of the defendant's later statements saying that they weren't, for example. The witness can be cross-examined to find flaws in their testimony, other witnesses to the conversation can be brought in, and so on. Hearsay is about things that are reported to the witness. If Alice testifies that Bob said he saw the defendant in the other city, the court could in principle investigate the fact of whether Bob actually said that, but that would be irrelevant. Bob is not on trial, was not under oath, cannot be cross-examined, and so on.
JBlack (+2) · 1y
It depends upon how strong the argument actually is compared with how strong you would expect it to be if the conclusion were true. It doesn't have to be a perfect argument, but if you have a high prior for the person making the argument to be competent at making arguments (as you would for a trial lawyer, for example) then your expected strength may be quite high and a failure to meet it may be valid evidence toward the conclusion being false. If the person making the argument is a random layperson and you expected them to present a zero-knowledge cryptographic protocol in support, then your model of the world is poorly calibrated. A bad world model can indeed result in wrong conclusions, and that together with bounded rationality (such as failure to apply an update to your world model as well as the target hypothesis) can mean being stuck in bad equilibria. That's not great, and it would be nice to have a model for bounded rationality that does guarantee converging toward truth, but we don't.
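(In odds form, this is just the standard likelihood-ratio update:

\[
\frac{P(H \mid \text{weak argument})}{P(\lnot H \mid \text{weak argument})}
= \frac{P(H)}{P(\lnot H)} \cdot
\frac{P(\text{weak argument} \mid H)}{P(\text{weak argument} \mid \lnot H)},
\]

so a weak argument counts against H only insofar as weak arguments are less likely from this particular arguer when H is true than when it is false.)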

Even if they were somehow extremely beneficial normally (which is fairly unlikely), any significant risk of going insane seems much too high. I would posit they have such a risk for exactly the same reason -when using them, you are deliberately routing around very fundamental safety features of your mind.

Donald Hobson has a good point about goodharting, but it's simpler than that. While some people want alignment so that everyone doesn't die, the rest of us still want it for what it can do for us. If I'm prompting a language model with "A small cat went out to explore the world" I want it to come back with a nice children's story about a small cat that went out to explore the world that I can show to just about any child. If I prompt a robot that I want it to "bring me a nice flower" I do not want it to steal my neighbor's rosebushes. And so on, I want it to be safe to give some random AI helper whatever lazy prompt is on my mind and have it improve things by my preferences.

Stop actively looking (though keep your ears open) when you have thoroughly researched two things: 

First, the core issues that could change your mind about who you'd think you should vote for. This is not about the candidates themselves.

Then, the candidates or questions on the ballot themselves. For candidates: Are they trustworthy? Where do they fall on the issues important to you? Do they implement these issues properly? Will they get things done? Are there issues they bring up that you would have included?  If so, look at those issu... (read more)

That seems like the wrong take away to me. Why do we change our minds so little?

1.) Some of our positions are just right.

2.) We are wrong, but we don't take the time and effort to understand why we should change our minds.

We don't know which situation we are in beforehand, but if you change your mind less than you think you do, doesn't that mean you think you are often wrong? And that you are wrong about how you respond to it?

You could try to figure out what possible things would get you to change your mind, and look directly at those things, trying to fully understand whether they are or are not the way that would change things.

Double (+1) · 1y
I have spent many hours on this, and I have to make a decision by two days from now. There's always the possibility that there is more important information to find, but even if I stayed up all night and did nothing else, I would not be able to read the entirety of the websites, news articles, opinion pieces, and social media posts relating to the candidates. Research costs resources! I suppose what I'm asking for is a way of knowing when to stop looking for more information. Otherwise I'll keep trying possibility 2 over and over and end up missing the election deadline!

This is well written, easy to understand, and I largely agree that instilling a value like love for humans in general (as individuals) could deal with an awful lot of failure modes. It does so amongst humans already (though far from perfectly).

When there is a dispute, rather than optimizing over smiles in a lifetime (a proxy of long-term happiness), preferable is obviously something more difficult like, if the versions of the person in both the worlds where it did and did not happen would end up agreeing that it is better to have happened, and that it woul... (read more)

Understanding which parts of an argument you dislike are actually things you can agree with seems like a valuable thing to keep in mind. The post is well written and easy to understand too. I probably won't do this any more than I already do though.

What I try to do isn't so different, just less formal. I usually simply agree or disagree directly on individual points that come up through trying to understand things in general. I do not usually keep in mind what the current score of agreement or disagreement is, and that seems to help not skew things to... (read more)

Yulia (+3) · 1y
I don't think you need the agreement-extent game then :) This more formal approach is probably helpful for people like me who tend to go on the offensive in face-to-face interactions. Most of what I wrote is my version of the argument. The two quotes I included are the extent to which Peterson presents his position. The video was on a somewhat different topic, so it's not surprising that he didn't explore it too deeply. He probably has a more elaborate explanation of his position somewhere on the web. I have a similar impression of European politics (though there's probably less polarization than in the US). I agree that it's a fatal flaw!   

The entire thing I wrote is that marrying human insights, tools, etc. with the scale increases leads to higher performance and shouldn't be discarded, not that you can't do better with a crazy amount of resources than with a small amount of resources and human insight.

Much later, with much more advancement, things improve. Two years after the things AlphaGo was famous for happened, they used scale to surpass it, without changing any of the insights. Generating games against itself is not a change to the fundamental approach in a well defined game like Go.... (read more)

quanticle (+4) · 1y
I'm not sure how any of what you said actually disproves the Bitter Lesson. Maybe AlphaZero isn't the best example of the Bitter Lesson, and MuZero is a better example. So what? Scale caught up eventually, though we may bicker about the exact timing. AlphaZero didn't use any human domain insights. It used a tree search algorithm that's generic across a number of different games. The entire reason that AlphaZero was so impressive when it was released was that it used an algorithm that did not encode any domain specific insights, but was still able to exceed state-of-the-art AI performance across multiple domains (in this case, chess and Go).

I probably should have included that or an explicit definition in what I wrote.

Yes, though I'm obviously arguing against what I think it means in practice, and how it is used, not necessarily how it was originally formulated. I've always thought it was the wrong take on AI history, tacking much too hard toward scale-based approaches and forgetting that the successes of other methods could be useful too, as an over-correction from when people made the other mistake.

I think your response shows I understood it pretty well. I used an example that you directly admit is against what the bitter lesson tries to teach as my primary example. I also never said anything about being able to program something directly better.

I pointed out that I used the things people decided to let go of so that I could improve the results massively over the current state of the machine translation for my own uses, and then implied we should do things like give language models dictionaries and information about parts of speech that it can use as... (read more)

In a lot of ways, this is similar to the 'one weird trick' style of marketing so many lampoon. Assuming that you summarized Kuhn and Feyerabend correctly, it looks like: one weird trick to solve all of science, Popper: 'just falsify things'; Kuhn: 'just find Kantian paradigms for the field'; Feyerabend: 'just realize the only patterns are that there aren't any patterns.'

People like this sort of thing because it's easy to understand (and I just made the same sort of simplification for this sentence). Science is hard to get right, and people can only keep in... (read more)

My memories of childhood aren't that precise. I don't really know what my childhood state was? Before certain extremely negative things happened to my psyche, that is. There are only a few scattered pieces I recall, like self-sufficiency and honesty being important, but these are the parts that already survived into my present political and moral beliefs.

The only thing I could actually use is that I was a much more orderly person when I was 4 or 5, but I don't see how it would work to use just that.

I can't say I'm surprised a utilitarian doesn't realize how vague it sounds? It is jargon taken from a word that simply means the ability to be used widely. Utility is an extreme abstraction, literally unassignable, and entirely based on guessing. You've straightforwardly admitted that it doesn't have an agreed-upon basis. Is it happiness? Avoidance of suffering? Fulfillment of the values of agents? Etc.

Utilitarians constantly talk about monetary situations, because that is one place they can actually use it and get results? But there, it's hardly diff... (read more)

artifex (+1) · 1y
Yes, to me utilitarian ethical theories do seem usually more interested in formalizing things. That is probably part of their appeal. Moral philosophy is confusing, so people seek to formalize it in the hope of understanding things better (that’s the good reason to do it, at least; often the motivation is instead academic, or signaling, or obfuscation). Consider Tyler Cowen’s review [https://marginalrevolution.com/marginalrevolution/2011/07/on-what-matters-vol-i-review-of-derek-parfit.html] of Derek Parfit’s arguments in On What Matters [https://en.wikipedia.org/wiki/On_What_Matters]:

You probably don't agree with this, but if I understand what you're saying, utilitarians don't really agree on anything or really have shared categories? Since utility is a nearly meaningless word outside of context due to broadness and vagueness, and they don't agree on anything about it, Utilitarianism shouldn't really be considered a thing itself? Just a collection of people who don't really fit into the other paradigms but don't rely on pure intuitions. Or in other words, pre-paradigmatic?

artifex (+1) · 1y
I don’t see “utility” or “utilitarianism” as meaningless or nearly meaningless words. “Utility” often refers to von Neumann–Morgenstern utilities and always refers to some kind of value assigned to something by some agent from some perspective that they have some reason to find sufficiently interesting to think about. And most ethical theories don’t seem utilitarian, even if perhaps it would be possible to frame them in utilitarian terms.

The easy first step is a simple bias toward inaction, which you can provide with a large punishment per output of any kind. For instance, a language model with this bias would write out something extremely likely, and then stop quickly thereafter. This is only a partial measure, of course, but it is a significant first step.
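A minimal sketch of what that per-output punishment could look like at decoding time; the function name, the candidate continuations, and the penalty value are all illustrative assumptions rather than any particular model's machinery:

```python
# Sketch: bias a decoder toward inaction/brevity by charging a fixed cost per emitted token.

def score_continuation(token_logprobs, per_token_penalty=2.0):
    """Sum of token log-probabilities minus a flat penalty for every token emitted.

    A large per_token_penalty makes longer outputs score worse, so the decoder
    prefers highly likely text and stops early.
    """
    return sum(token_logprobs) - per_token_penalty * len(token_logprobs)

# Two hypothetical candidate continuations with made-up per-token log-probabilities:
short_and_likely = [-0.1, -0.2, -0.1]     # 3 tokens, very probable
long_and_rambling = [-0.3] * 12           # 12 tokens, still fairly probable

print(score_continuation(short_and_likely))    # about -0.4 - 6.0  = -6.4
print(score_continuation(long_and_rambling))   # about -3.6 - 24.0 = -27.6
# The penalized score favors the short, high-likelihood output.
```

The same idea could be pushed further by making the penalty part of the training objective rather than just the decoding rule, which is roughly what the harder later steps would have to address.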

Second through n-th step: harder. I really don't even know how you figure out what values to try to train it with to reduce impact. The immediate things I can think of might also train deceit, so it would take some thought.

Al... (read more)

You seem to have straw-manned your ideological opponents. Your claims are neither factually accurate, nor charitable. They don't point you in a useful direction either. Obviously, conservatives can be very wrong, but your assumptions seem unjustified.

And what are you going to claim your opponents do? In the US, rightists claim they will lower taxes, which they do. They claim they will reduce regulation, which they do. They claim they will turn back whatever they deem the latest outrage...which has mixed results. They explain why all of this is good by refe... (read more)

I had a thought while reading the section on Goodharting. You could fix a lot of the potential issues with an agentic AI by training it to want its impact on the world to be within small bounds. Give it a strong and ongoing bias toward 'leave the world the way I found it.' This could only be overcome by a very clear and large benefit toward its other goals per small amount of change. It should not be just part of the decision making process though, but part of the goal state. This wouldn't solve every possible issue, but it would solve a lot of them. In other words, make it unambitious and conservative, and then its interventions will be limited and precise if it has a good model of the world.
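One way to sketch the idea as an objective (a schematic only; λ and d are placeholders for a trade-off weight and an impact measure that would themselves still need to be specified):

\[
a^{*} \;=\; \arg\max_{a}\ \Big[\, V_{\text{task}}(a) \;-\; \lambda \, d\big(s_{a},\, s_{\text{baseline}}\big) \,\Big],
\]

where s_a is the state the world ends up in after action a, s_baseline is the "leave the world the way I found it" state, and a large λ means only a very clear and large benefit can justify even a small change.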

Thomas Larsen (+3) · 1y
I think this is part of the idea behind Eliezer-corrigibility, and I agree that if executed correctly, this would be helpful. The difficulty with this approach that I see is:
1) how do you precisely specify what you mean by "impact on the world to be within small bounds" -- this seems to require a measure of impact. This would be amazing (and is related to value extrapolation [https://www.lesswrong.com/posts/i8sHdLyGQeBTGwTqq/value-extrapolation-concept-extrapolation-model-splintering]).
2) how do you induce the inner value of "being low impact" into the agent, and make sure that this generalizes in the intended way off of the training distribution. (these roughly correspond to the inner and outer alignment problem)
3) being low impact is strongly counter to the "core of consequentialism": convergent instrumental goals for pretty much any agent cause it to seek power.

The obvious thing to keep in mind is that people dislike 'inauthentic' politicians but there are many things the people want in politicians. If a politician wants to portray a certain image to match what people like, they have to pick the policy positions that go along with it to avoid said label. Said images are not so evenly or finely distributed as policy positions, so that tends to lead to significant differences along these axes.

Then, over time, the parties accrete their own new images too, and people have to position themselves in relation to those too...
