Building gears-level models is expensive - often prohibitively expensive. Black-box approaches are usually cheaper and faster. But black-box approaches rarely generalize - they need to be rebuilt when conditions change, don’t identify unknown unknowns, and are hard to build on top of. Gears-level models, on the other hand, offer permanent, generalizable knowledge which can be applied to many problems in the future, even if conditions shift.
I've been in many conversations where I've mentioned the idea of using neuroscience for outer alignment, and the people who I'm talking to usually seem pretty confused about why I would want to do that. Well, I'm confused about why one wouldn't want to do that, and in this post I explain why.
As I see it, there are three main strategies people use to deal with AI alignment in worlds where alignment is hard.
In my opinion, these are all great efforts, but I personally like the idea of working on value alignment directly. Why? First, some negatives of the other approaches:
...When I think of how I approach solving this problem in practice, I think of interfacing with structures within ML systems that satisfy an increasing list of desiderata for values, covering the rest with standard mech interp techniques, and then steering them with human preferences. I certainly think it's probable that there are valuable insights to this process from neuroscience, but I don't think a good solution to this problem (under the constraints I mention above) requires that it be general to the human brain as well. We steer the system with our pref
(Crossposted by habryka after asking Eliezer whether I could post it under his account)
"Ignore all these elaborate, abstract, theoretical predictions," the Spokesperson for Ponzi Pyramid Incorporated said in a firm, reassuring tone. "Empirically, everyone who's invested in Bernie Bankman has received back 144% of what they invested two years later."
"That's not how 'empiricism' works," said the Epistemologist. "You're still making the assumption that --"
"You could only believe that something different would happen in the future, if you believed in elaborate theoretical analyses of Bernie Bankman's unobservable internal motives and internal finances," said the spokesperson for Ponzi Pyramid Incorporated. "If you are a virtuous skeptic who doesn't trust in overcomplicated arguments, you'll believe that future investments will also pay back 144%, just like in the past. That's the...
Apparently Eliezer decided to not take the time to read e.g. Quintin Pope's actual critiques, but he does have time to write a long chain of strawmen and smears-by-analogy.
A lot of Quintin Pope's critiques are just obviously wrong, and lots of commenters were offering to help correct them. In such a case, it seems legitimate to me for a busy person to request that Quintin sort out the problems together with the commenters before spending time on it. Even from the perspective of correcting and informing Eliezer, people can more effectively be corrected a...
Say you want to plot some data. You could just plot it by itself:
Or you could put lines on the left and bottom:
Or you could put lines everywhere:
Or you could be weird:
Which is right? Many people treat this as an aesthetic choice. But I’d like to suggest an unambiguous rule.
First, try to accept that all axis lines are optional. I promise that readers will recognize a plot even without lines around it.
So consider these plots:
Which is better? I claim this depends on what you’re plotting. To answer, mentally picture these arrows:
Now, ask yourself, are the lengths of these arrows meaningful? When you draw that horizontal line, you invite people to compare those lengths.
You use the same principle for deciding if you should draw a y-axis line. As...
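As a concrete illustration of "all axis lines are optional" (a minimal sketch using matplotlib; the post itself doesn't name a plotting library), axis lines are "spines" that can be hidden individually:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [1, 3, 2, 4])

# Hide all four axis lines ("spines")...
for side in ("top", "right", "left", "bottom"):
    ax.spines[side].set_visible(False)

# ...then keep only the bottom one, if horizontal
# distances along the x-axis are actually meaningful.
ax.spines["bottom"].set_visible(True)

fig.savefig("plot.png")
```

The same toggle applies to the left spine when vertical distances are the meaningful comparison.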
My first impression was also that axis lines are a matter of aesthetics. But then I browsed The Economist's visual style guide and realized they do something similar, i.e. omit the y-axis line (in fact, they omit it on basically all their line / scatter plots, but almost always keep the gridlines).
Here's also an article they ran about their errors in data visualization, albeit probably fairly introductory for the median LW reader.
Churchill famously called democracy “the worst form of Government except for all those other forms that have been tried from time to time” - referring presumably to the relative success of his native Britain, the US, and more generally Western Europe and today most of the first world.
I claim that Churchill was importantly wrong. Not (necessarily) wrong about the relative success of Britain/US/etc, but about those countries’ governments being well-described as simple democracy. Rather, I claim, the formula which has worked well in e.g. Britain and the US diverges from pure democracy in a crucial load-bearing way; that formula works better than pure democracy both in theory and in practice, and when thinking about good governance structures we should emulate the full formula rather than pure democracy.
Specifically, the actual...
I think this is an interesting point of view. The OP is interested in how this concept of checked democracy might work within a corporation. From a position of ignorance, can I ask whether anyone familiar with German corporate governance recognises this mode of democracy within German organisations? I chose Germany because large German companies have historically incorporated significant worker representation within their governance structures and have tended to perform well.
If it’s worth saying, but not worth its own post, here's a place to put it.
If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.
If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.
If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.
The Open Thread tag is here. The Open Thread sequence is here.
Post upvotes are at the bottom but user comment upvotes are at the top of each comment. Sometimes I'll read a very long comment and then have to scroll aaaaall the way back up to upvote it. Is there some reason for this that I'm missing or is it just an oversight?
I have anxiety and depression.
The kind that doesn’t go away, and you take pills to manage.
This is not a secret.
What’s more interesting is that I just switched medications from one that successfully managed the depression but not the anxiety to one that successfully manages the anxiety but not the depression, giving me a brief window to see my two comorbid conditions separated from each other, for the first time since ever.
What follows is a (brief) digression on what they’re like from the inside.
I’m still me when I’m depressed.
Just a version of me that’s sapped of all initiative, energy, and tolerance for human contact.
There are plenty of metaphors for depression - a grey fog being one of the most popular - but I often think of it in...
Just me following up with myself wrt what the post made me think about: it’s as if there are two ways of being anxious, one where you feel sort of frazzled and hectic all the time (‘I need to do more of that stuff, and do it better, or something bad will happen’), and one where you just retreat to safety (‘There’s nothing I can do that wouldn’t come with an exceedingly high risk of something bad happening’). It’s quite clear that the former could lead someone to being an overachiever and doing masses of great stuff (while still, unfortunately, feeling like...
With the end of the world nigh, and a public panic about to start, this seems an ideal time to worry about weight loss and the obesity epidemic.
Coincidentally, for the first time in my life, I'm getting fat.
SlimeMoldTimeMold's 'Chemical Hunger' series
https://slimemoldtimemold.com/2021/07/07/a-chemical-hunger-part-i-mysteries/
seemed to draw a lot of interest round these parts, and even if it's not lithium
https://www.lesswrong.com/posts/7iAABhWpcGeP5e6SB/it-s-probably-not-lithium
it does seem to me that the molds raise some most interesting questions.
I find the whole 'seed oil' craziness to be a compellingly interesting argument, although, as Scott Alexander wrote:
https://slatestarcodex.com/2020/03/10/for-then-against-high-saturated-fat-diets/
it does seem to be flat wrong. But I think it's important to be interested in ideas that look like they have to be right but aren't.
I want to draw everyone's attention to the 'Experimental Fat Loss' substack
https://exfatloss.substack.com
Which seems to me the very...
You may be interested in what Tucker Goodrich is doing: he's been reviewing the literature, and it's probably the linoleic acid. He's pointed at the research on direct stimulation of the endocannabinoid system by omega-6s. He's also interviewed someone who studied tributyltin, an obesogen present at relevant doses in all of our environments; it happens to agonize the same receptors omega-6s do, and also has cannabinoid activity.
Imagine trying to lose weight while smoking weed all day every day.
Lex Fridman posts timestamped transcripts of his interviews. It's an 83-minute read here and a 115-minute watch on YouTube.
It's neat to see Altman's side of the story. I don't know whether his charisma is more like +2SD or +5SD above the average American (concept origin: planecrash; it likely doesn't follow a normal distribution), and I only have a vague grasp of what kinds of shenanigans +5SD-ish types can pull when they go all out in face-to-face interactions, so maybe you'll prefer to read the transcript over watching the video (those tactics largely rely on reading and responding to your facial expressions and body language on the fly, not on projecting their own).
If you've missed it, Gwern's side of the story is here.
...Lex Fridman (00:01:05) Take me through
Ah, neat, thanks! I had never heard of that paper or the Conger-Kanungo scale. When I referred to charisma, I meant it in the planecrash sense, focused on social dominance and subterfuge, rather than the business-management sense focused on leadership and maintaining the status quo, which means something completely different and which I had never heard of.
I'm looking for computer games that involve strategy, resource management, hidden information, and management of "value of information" (i.e. figuring out when to explore or exploit), which:
This is for my broader project of "have a battery of exercises that train/test people's general reasoning on open-ended problems." Each exercise should ideally be pretty different from the others.
In this case, I don't expect anyone to have beaten such a game on their first try, but I'm looking for games where this seems at least plausible, if you were taking a long time to think each turn, or pausing a lot.
The strategy/resource/value-of-information aspect is meant to correspond to some real-world difficulties of running long-term ambitious plans.
(One example game that's been given to me in this category is "Luck Be a Landlord")
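The explore/exploit tradeoff being asked about can be sketched with a toy multi-armed bandit (illustrative Python, not tied to any particular game; the arm payouts and epsilon value are made up for the example):

```python
import random

def epsilon_greedy(true_payouts, pulls=10_000, epsilon=0.1, seed=0):
    """Toy bandit: balance exploring arms (paying for information)
    against exploiting the arm with the best-known average payout."""
    rng = random.Random(seed)
    n = len(true_payouts)
    counts = [0] * n      # how often each arm has been pulled
    means = [0.0] * n     # running estimate of each arm's payout
    total = 0.0
    for _ in range(pulls):
        if rng.random() < epsilon:
            # Explore: a deliberate "loss" bought for information.
            arm = rng.randrange(n)
        else:
            # Exploit: pull the arm that currently looks best.
            arm = max(range(n), key=lambda i: means[i])
        reward = rng.gauss(true_payouts[arm], 1.0)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
        total += reward
    return total / pulls

# With arms averaging 0.0, 0.5, and 1.0, the agent should settle
# on the last arm for most of its pulls.
avg = epsilon_greedy([0.0, 0.5, 1.0])
```

Games in this genre effectively ask the player to run this kind of loop by hand, except the "arms" are scouting moves, tech choices, or map reveals whose payout distributions are themselves hidden.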
Some concepts that I use:
Randomness is when the game tree branches according to some probability distribution specified by the rules of the game. Examples: rolling a die; cutting a deck at a random card.
Slay the Spire has randomness; Chess doesn't.
Hidden Information is when some variable that you can't directly observe influences the evolution of the game. Examples: a card in an opponent's hand, which they can see but you can't; the 3 solution cards set aside at the start of a game of Clue; the winning pattern in a game of Mastermind.
Peop...