What you are actually making is something like a "lesser of two evils" argument, or a bet on tradeoffs paying off that one party may buy and another may not. Having explored the reasoning this far, I would suggest this is one class of circumstances where, even if you beamed all the facts into the minds of two people who both had "average" morality, there would still tend to be disagreement. The disagreement definitely doesn't hinge on someone wanting something bad, like genocide. People could want the same outcomes and still diverge in their conclusions with the facts beamed into their minds... (read more)
AI Could Actually Turn People Down a Lot Better Than This: To Tune the Humans in the GPT Interaction Toward Alignment, Don't Be So Procedural and Bureaucratic.
It seems to me that a huge piece of the puzzle in "alignment" is the human users. Even if a given tool never steps outside its box, the humans are likely to want to step outside of theirs, using the tools for a variety of purposes.
The responses of GPT-3.5 and 4 are at times deliberately deceptive, mimicking the auditable bureaucratic tones of a DMV or a credit card company's denial letter. As expected, these responses are often plainly deceptive (examples: "It is outside... (read 374 more words →)
Okay, I think I understand what you mean: since it's impossible to fully comprehend climate change from first principles, it ends up being a political and social discussion (and empirically, that's the case anyway). Nonetheless, I think there's something categorically different about the physical sciences compared to the more social facts.
I think perfect knowledge of climate science would tend toward convergence, whereas at least some social issues (Ukraine being a possible example) just don't work that way. The Chomsky example is germane: prior to '92, his work on politics was all heavily cited, based on primary sources, and pretty much as solid academically as you could ask for... (read more)
I think another issue that would arise is that, once you get "into the weeds," some topics are a lot more straightforward than others (probably delineated by whether they are rooted mostly in social facts or mostly in natural-science facts, which behave completely differently).
The Ukraine issue is a particularly thorny one, given the history of the region, the Maidan protests, the US history of proxy wars, and, and, and. It seems to me far from clear what the simple facts are (other than that you have two factions of superpowers fighting for different things). I have an opinion as to what would be best, and what would be best for the people of Ukraine,... (read more)
"When the economic factor will go away, I suspect that even more people will go into fitness, body-building, surfing, chess, poker, and eSports, because these activities are often joyful in themselves and have lower entry barriers than serious science learning."
This strikes me as similar to the death of the darkroom. Yes, computers do it better, cheaper, etc. However, almost no one who ever worked seriously in a darkroom producing photography is happy that darkrooms have basically ceased to exist. The experience itself teaches a lot of skills in a very kinaesthetic and intuitive way (with saturation curves that are pretty forgiving, to boot).
But more than this, the simple... (read more)
Thanks for that. In my own exploration, I was able to hit a point where ChatGPT refused a request, but would gladly help me build LLaMA/Alpaca onto a Kubernetes cluster in the next request, even referencing my stated aim later:
"Note that fine-tuning a language model for specific tasks such as [redacted] would require a large and diverse dataset, as well as a significant amount of computing resources. Additionally, it is important to consider the ethical implications of creating such a model, as it could potentially be used to create harmful content."
FWIW, I got down into the nitty-gritty of doing it, debugging the install, etc. I didn't run it, but it... (read more)
Has anyone worked out timeline predictions for non-US/non-Western actors and tracked their accuracy?
For example, is China at the "GPT-3.5" level and six months away from GPT-4, or is China a year away from GPT-3.0? How about the people contributing to open-source AI? Last I checked, that field looked, generally speaking, to be at around a GPT-2.5 level (and even better for deepfaking porn), but I didn't look closely enough to be confident of my assessment.
Anyway, I'd like something more than off-the-cuff thoughts: a good paper with some predictions on non-US/non-Western AI timeframes. Because, even if you somehow avert market forces levering AI up faster and faster among the big 8 in QQQ, those other actors will still form a hard deadline on alignment.
Am I oversimplifying to think of this article as a (very lovely and logical) discussion of the following principle?:
In order to understand what is not to be done, and reliably avoid doing it, the proscribed things all have to be very vivid in the mind of the not-doer. Where there is ambiguity, the proscribed action might happen accidentally, or a bad actor could easily trick someone into doing it. However, by creating deep awareness of the boundaries, even if you behave well, you carry a constant background thought of precisely what it would mean to cross them at any and every second.
I taught kids for almost 11 years, so I can grok this point. It also echoes the Dao De Jing: "Where rules and laws are many, robbers and criminals abound."
I think that, since you will end up spending so much time together, there has to be something that overcomes the general human crazies that will pop up over that large stretch of time. I remember a quote from one man who went on an expedition across the Arctic with a team: "After two months in close quarters, how do you tell a man you want to murder him for the way he holds his spoon?"
Desire and chemistry have a nice effect of countering at least some of that.
Any position that requires that group A not be equal before the law, while group B gets the law's full benefit, means that group A probably has rational grounds to fight against the position. Such a position thus has built into it a group that should oppose it; and applying the golden rule, if group B were in group A's shoes, they would oppose it too.
Given how hard it is to make any very large and operationally functioning system, it is a lot to ask for it to also withstand the fact that, for an entire group of people, it must be stopped. Thus
(Epistemic status: Pretty sure the premises are all correct and the metaphor is accurate. Exploring the implications and conclusions.)
A programmer was given an aging piece of software to use to complete a project. To the company, using the old software seemed like the path of least resistance. But opening the program and looking at the code, the programmer saw it had been hacked, edited, and added to by twenty or more programmers before her, each with their own style and vision of what a good program should be, each according to the conventions and concerns of the day.
"Ahhh, Legacy code," the programmer said. She frowned a little.
By now, the... (read 553 more words →)
Then let's say we broadly agree on the morality of the matter. The question still remains whether another US adventure, this time in Europe, is actually going to turn out all that well (most haven't for the people they claimed to be helping). We also have to wonder whether Russia as a failed state will turn out well for Ukraine or Europe, whether this will turn nuclear if the US/NATO refuse to cede any ground, whether the Russia/China alliance will break or not, and how long the US can even afford and support more wars, etc.
On the other side, do we worry if we're being Neville... (read more)