Building gears-level models is expensive - often prohibitively expensive. Black-box approaches are usually cheaper and faster. But black-box approaches rarely generalize - they need to be rebuilt when conditions change, don’t identify unknown unknowns, and are hard to build on top of. Gears-level models, on the other hand, offer permanent, generalizable knowledge which can be applied to many problems in the future, even if conditions shift.
I've been in many conversations where I've mentioned the idea of using neuroscience for outer alignment, and the people I'm talking to usually seem pretty confused about why I would want to do that. Well, I'm confused about why one wouldn't want to do that, and in this post I explain my reasoning.
As far as I see it, there are three main strategies people have for trying to deal with AI alignment in worlds where AI alignment is hard.
In my opinion, these are all great efforts, but I personally like the idea of working on value alignment directly. Why? First some negatives of the others:
...First is that I don't really expect us to come up with a fully general answer to this problem in time. I wouldn't be surprised if we had to trade off some generality for indexing on the system in front of us - this gets us some degree of non-robustness, but hopefully enough to buy us a lot more time before stuff like the problem behind deep deception breaks a lack of True Names. Hopefully then we can get the AI systems to solve the harder problem for us in the time we've bought, with systems more powerful than us. The relevance here is that if this is the
S-risks are barely discussed on LW. Is that because:
Is the era of AI agents writing complex code systems without humans in the loop upon us?
Cognition is calling Devin ‘the first AI software engineer.’
Here is a two minute demo of Devin benchmarking LLM performance.
Devin has its own web browser, which it uses to pull up documentation.
Devin has its own code editor.
Devin has its own command line.
Devin uses debugging print statements and uses the log to fix bugs.
Devin builds and deploys entire stylized websites without even being directly asked.
What could possibly go wrong? Install this on your computer today.
Padme.
I would by default assume all demos were supremely cherry-picked. My only disagreement with Austen Allred’s statement here is that this rule is not new:
...Austen Allred: New rule:
If someone only shows their AI model in tightly
(Crossposted by habryka after asking Eliezer whether I could post it under his account)
"Ignore all these elaborate, abstract, theoretical predictions," the Spokesperson for Ponzi Pyramid Incorporated said in a firm, reassuring tone. "Empirically, everyone who's invested in Bernie Bankman has received back 144% of what they invested two years later."
"That's not how 'empiricism' works," said the Epistemologist. "You're still making the assumption that --"
"You could only believe that something different would happen in the future, if you believed in elaborate theoretical analyses of Bernie Bankman's unobservable internal motives and internal finances," said the Spokesperson for Ponzi Pyramid Incorporated. "If you are a virtuous skeptic who doesn't trust in overcomplicated arguments, you'll believe that future investments will also pay back 144%, just like in the past. That's the...
Apparently Eliezer decided to not take the time to read e.g. Quintin Pope's actual critiques, but he does have time to write a long chain of strawmen and smears-by-analogy.
A lot of Quintin Pope's critiques are just obviously wrong and lots of commenters were offering to help correct them. In such a case, it seems legitimate to me for a busy person to request that Quintin sort out the problems together with the commenters before spending time on it. Even from the perspective of correcting and informing Eliezer, people can more effectively be corrected a...
Say you want to plot some data. You could just plot it by itself:
Or you could put lines on the left and bottom:
Or you could put lines everywhere:
Or you could be weird:
Which is right? Many people treat this as an aesthetic choice. But I’d like to suggest an unambiguous rule.
First, try to accept that all axis lines are optional. I promise that readers will recognize a plot even without lines around it.
So consider these plots:
Which is better? I claim this depends on what you’re plotting. To answer, mentally picture these arrows:
Now, ask yourself, are the lengths of these arrows meaningful? When you draw that horizontal line, you invite people to compare those lengths.
You use the same principle for deciding if you should draw a y-axis line. As...
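As a rough sketch of how this rule might be applied in practice, here is one way to toggle axis lines in matplotlib. The helper name `style_axis_lines`, its two flags, and the sample data are all made up for illustration, and hiding the top and right borders is my own assumption rather than something the post prescribes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def style_axis_lines(ax, x_line_meaningful, y_line_meaningful):
    """Hypothetical helper: show an axis line only if distances from it mean something."""
    # Assumption: the top and right borders are never reference lines, so hide them.
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)
    # Bottom spine: the horizontal line that vertical distances get compared against.
    ax.spines["bottom"].set_visible(x_line_meaningful)
    # Left spine: the vertical line that horizontal distances get compared against.
    ax.spines["left"].set_visible(y_line_meaningful)
    return ax


fig, ax = plt.subplots()
# Made-up data: bar heights are measured from zero, so the horizontal line is
# meaningful, while horizontal distance from the left edge is not.
ax.bar(["a", "b", "c"], [3, 7, 5])
style_axis_lines(ax, x_line_meaningful=True, y_line_meaningful=False)
```

Whether each flag should be true is still a judgment about the data, not something the code can decide for you.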
My first impression was also that axis lines are a matter of aesthetics. But then I browsed The Economist's visual style guide and realized they also do something similar, i.e. omit the y-axis line (in fact, they omit the y-axis line on basically all their line / scatter plots, but almost always maintain the gridlines).
Here's also an article they ran about their errors in data visualization, albeit probably fairly introductory for the median LW reader.
Churchill famously called democracy “the worst form of Government except for all those other forms that have been tried from time to time” - referring presumably to the relative success of his native Britain, the US, and more generally Western Europe and today most of the first world.
I claim that Churchill was importantly wrong. Not (necessarily) wrong about the relative success of Britain/US/etc, but about those countries’ governments being well-described as simple democracy. Rather, I claim, the formula which has worked well in e.g. Britain and the US diverges from pure democracy in a crucial load-bearing way; that formula works better than pure democracy both in theory and in practice, and when thinking about good governance structures we should emulate the full formula rather than pure democracy.
Specifically, the actual...
I think this is an interesting point of view. The OP is interested in how this concept of checked democracy might work within a corporation. From a position of ignorance can I ask whether anyone familiar with German corporate governance recognises this mode of democracy within German organisations? I choose Germany because large German companies historically incorporate significant worker representation within their governance structures, and, historically, tend to perform well.
If it’s worth saying, but not worth its own post, here's a place to put it.
If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.
If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.
If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.
The Open Thread tag is here. The Open Thread sequence is here.
Post upvotes are at the bottom but user comment upvotes are at the top of each comment. Sometimes I'll read a very long comment and then have to scroll aaaaall the way back up to upvote it. Is there some reason for this that I'm missing or is it just an oversight?
I have anxiety and depression.
The kind that doesn’t go away, and you take pills to manage.
This is not a secret.
What’s more interesting is that I just switched medications from one that successfully managed the depression but not the anxiety to one that successfully manages the anxiety but not the depression, giving me a brief window to see my two comorbid conditions separated from each other, for the first time since ever.
What follows is a (brief) digression on what they’re like from the inside.
I’m still me when I’m depressed.
Just a version of me that’s sapped of all initiative, energy, and tolerance for human contact.
There are plenty of metaphors for depression - a grey fog being one of the most popular - but I often think of it in...
Just me following up with myself wrt what the post made me think about: it’s as if there are two ways of being anxious, one where you feel sort of frazzled and hectic all the time (‘I need to do more of that stuff, and do it better, or something bad will happen’), and one where you just retreat to safety (‘There’s nothing I can do that wouldn’t come with an exceedingly high risk of something bad happening’). It’s quite clear that the former could lead someone to being an overachiever and doing masses of great stuff (while still, unfortunately, feeling like...