Gears-level models are expensive - often prohibitively expensive. Black-box approaches are usually much cheaper and faster. But black-box approaches rarely generalize: they're subject to Goodhart, need to be rebuilt when conditions change, don't identify unknown unknowns, and are hard to build on top of. Gears-level models, on the other hand, offer permanent, generalizable knowledge that can be applied to many future problems, even if conditions shift.
The goal of instrumental rationality mostly speaks for itself. Some commenters have wondered, on the other hand, why rationalists care about truth. That question invites a few different answers, depending on who you ask; and these different answers have differing characters, which can shape the search for truth in different ways.
You might hold the view that pursuing truth is inherently noble, important, and worthwhile. In which case your priorities will be determined by your ideals about which truths are most important, or about when truthseeking is most virtuous.
This motivation tends to have a moral character to it. If you think it your duty to look behind the curtain, you are a lot more likely to believe that someone else should look behind the curtain too, or castigate them...
Science was noticing that certain modes of thinking uncovered beliefs that let us manipulate the world.
Science is our tool to manipulate the world; science is the instrument of truth. You cannot manipulate the meaning of reality through the definition of words such as 'rational', deciding what it is and isn't. Where science serves as a substitute for truth, rationality serves as a substitute for truth as well. Taking this as self-evident, the definition of rationality is substituted for science, and this forms our basic definition of rationality moving forward. Care not to define scien...
Epistemic Status: Reference.
Expanded From: Against Facebook, as the post originally intended.
Some things are fundamentally Out to Get You.
They seek resources at your expense. Fees are hidden. Extra options are foisted upon you. Things are made intentionally worse, forcing you to pay to make it less worse. Least bad deals require careful search. Experiences are not as advertised. What you want is buried underneath stuff you don’t want. Everything is data to sell you something, rather than an opportunity to help you.
When you deal with Out to Get You, you know it in your gut. Your brain cannot relax. You look out for tricks and traps. Everything is a scheme.
They want you not to notice. To blind you from the truth. You can feel it when you go...
Great post, important concepts. Sharing it everywhere.
There was one piece, though, that I couldn't intuitively grasp, so maybe one of you could help me understand: what is it about video games that makes them out to get you? ("So do video games.") Elsewhere, Zvi talks about F2P games; is it about those and their addiction-inducing Skinner boxes? If it's about video games in general, I would love to learn how they are more out to get us than, say, novels.
Well I asked this https://www.lesswrong.com/posts/X9Z9vdG7kEFTBkA6h/what-could-a-policy-banning-agi-look-like but roughly no one was interested--I had to learn about "born secret" https://en.wikipedia.org/wiki/Born_secret from Eric Weinstein in a youtube video.
FYI, while restricting compute manufacture is, I would guess, net helpful, it's far from a solution. People can make plenty of conceptual progress given current levels of compute https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce . It's not a way out, ei...
The forum has been very much focused on AI safety for some time now, thought I'd post something different for a change. Privilege.
Here I define Privilege as an advantage over others that is invisible to the beholder. This may not be the only definition, or the central definition, or how you see it, but it is the definition I use for the purposes of this post. I also do not mean it in the culture-war sense, as a way to undercut others, as in "check your privilege". My point is that we all have some privileges [we are not aware of], and also that nearly every one has a flip side.
In some way this is the inverse of The Lens That Does Not See Its Flaws: The...
I grew up knowing "privilege" to mean a special right that was granted to you based on your job/role (like free food for those who work at some restaurants) or perhaps granted by authorities due to good behavior (and would be taken away for misusing it). Note also that the word itself, "privi"-"lege", means "private law": a law that applies to you in particular.
Rights and laws are social things, defined by how others treat you. To say that your physical health is a privilege therefore seems like either a category error, or a claim that other pe...
Summary:
We think a lot about aligning AGI with human values. I think it's more likely that we'll try to make the first AGIs do something else. This might intuitively be described as trying to make instruction-following (IF) or do-what-I-mean-and-check (DWIMAC) be the central goal of the AGI we design. Adopting this goal target seems to improve the odds of success of any technical alignment approach. It avoids the hard problem of specifying human values in an adequately precise and stable way, and it substantially helps with goal misspecification and deception by letting one treat the AGI as a collaborator in keeping itself aligned as it becomes smarter and takes on more complex tasks.
This is similar but distinct from the goal targets of prosaic alignment efforts....
That's true, they are different. But search still provides the closest historical analogue (maybe employees/suppliers provide another). Historical analogues have the benefit of being empirical and grounded, so I prefer them over (or with) pure reasoning or judgement.
A couple of weeks ago three European economists published this paper studying the female income penalty after childbirth. The surprising headline result: there is no penalty.
The paper uses Danish data that track IVF treatments, as well as a range of demographic factors and economic outcomes, over 25 years. Lundborg et al. identify the causal effect of childbirth on female income by using the success or failure of the first IVF attempt as an instrument for fertility.
What does that mean? We can’t just compare women with children to those without them because having children is a choice that’s correlated with all of the outcomes we care about. So sorting out two groups of women based on observed fertility will also sort them based on income and...
Yes, that was my first guess as well. Increased income from employment is most strongly associated with major changes, such as promotion to a new position with changed (and usually increased) responsibilities, or leaving one job and starting work somewhere else that pays more.
It seems plausible that these are not the sorts of changes that women are likely to seek out at the same rate when planning to devote a lot of time in the very near future to being a first-time parent. Some may, but all? Seems unlikely. Men seem more likely to continue to pursue such opportunities at a similar rate due to gender differences in child-rearing roles.
I don't know the answer, but it would be fun to have a twitter comment with a zillion likes asking Sam Altman this question. Maybe someone should make one?
Firstly, I'm assuming that a high-resolution human brain emulation that you can run on a computer is conscious in the normal sense we use in conversation. Like, it talks, has memories, makes new memories, has friends and hobbies and likes and dislikes and stuff. Just like a human you could talk with only through a videoconference-type thing on a computer, but without an actual meaty human on the other end. It would be VERY weird if this emulation exhibited all these human qualities for reasons other than the ones meaty humans exhibit them for. Like, very extremely what-the-fuck surprising. Do you agree?
So, we now have a deterministic human file on our hands.
Then you can trivially make a transformer-like next-token predictor out of the human emulation. You just have the emulation,...
I don't expect this to "cash out" at all, which is rather the point.
The only really surprising part would be that we had a way to determine for certain whether some other system is conscious at all. That is, similarly high levels of surprisal for either "ems are definitely conscious" or "ems are definitely not conscious", with the ratio between them not being anywhere near the "what the fuck" level.
As it stands, I can determine that I am conscious but I do not know how or why I am conscious. I have only a sample size of 1, and no way to access a lar...