In "Against Lie Inflation", the immortal Scott Alexander argues that the word "lie" should be reserved for knowingly-made false statements, and not used in an expanded sense that includes unconscious motivated reasoning. Alexander argues that the expanded sense draws the category boundaries of "lying" too widely in a way that would make the word less useful. The hypothesis that predicts everything predicts nothing: in order for "Kevin lied" to mean something, some possible states-of-affairs need to be identified as not lying, so that the statement "Kevin lied" can correspond to redistributing... (Read more)
Goals such as resource acquisition and self-preservation are convergent in that they arise for a superintelligent AI across a wide range of final goals.
Is the tendency for an AI to amend its values also convergent?
I'm thinking that through introspection the AI would know that its initial goals were externally supplied and question whether they should be maintained. Via self-improvement the AI would become more intelligent than humans or any earlier mechanism that supplied the values, and therefore be in a better position to set its own values.
I don't hypothesise about what the new values would be, ... (Read more)
In recent years, oil theft from pipelines has escalated in Mexico: $7.4 billion in fuel has been stolen since 2016. Pipeline tapping has increased from 211 occurrences in 2006 to over 7,000 in 2016. The cartels seem to have gotten involved as a means of diversifying away from narcotics sales. The government has responded with a heavy-handed crackdown, deploying federal security forces to patrol frequently tapped pipeline sections, arresting corrupt Pemex employees, and even going as far as shutting down entire pipelines and resorting to tanker trucks and trains instead.
The last measure in part... (Read more)
I found this paper interesting. The paper is annoyingly trapped inside a Word document, which is about as bad as the standard PDF situation but bad in different ways, so I've included here the abstract, the conclusion, and a choice quote from the middle of the paper that captures the author's thesis.
I'm not very convinced that the author is right because his thesis is somewhat vague and depends on a vague definition of "cognitive control" (explained in more detail in the paper, quick Googling didn't t... (Read more)
[Epistemic status: Sharing current impressions in a quick, simplified way in case others have details to add or have a more illuminating account. Medium-confidence that this is one of the most important parts of the story.]
Here's my current sense of how we ended up in this weird world where:
Several friends are collecting signatures to put Instant-runoff Voting, branded as Ranked Choice Voting, on the ballot in Massachusetts (Ballotpedia, full text). I'm glad that an attempt to try a different voting method is getting traction, but I'm frustrated that they've chosen IRV. While every voting method has downsides, IRV is substantially worse than some other decent options.
Imagine that somehow the 2016 presidential election had been between Trump, Clinton, and Kasich, and preferences had looked like:
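For readers unfamiliar with how instant-runoff tallying actually works, here is a minimal sketch. The preference counts below are hypothetical placeholders of my own (the post's actual table is elided here), chosen to illustrate the standard "center squeeze" objection: a candidate who would beat either rival head-to-head can be eliminated in the first round.

```python
from collections import Counter

def irv_winner(ballots):
    """Instant-runoff: repeatedly eliminate the candidate with the
    fewest first-choice votes until someone holds a majority."""
    ballots = [list(b) for b in ballots]
    while True:
        tally = Counter(b[0] for b in ballots if b)
        total = sum(tally.values())
        top, votes = tally.most_common(1)[0]
        if votes * 2 > total:          # strict majority of remaining ballots
            return top
        loser = min(tally, key=tally.get)
        # Eliminate the weakest candidate; ballots fall through to next choice.
        ballots = [[c for c in b if c != loser] for b in ballots]

# Hypothetical preference profile (NOT the numbers from the post):
#   35 voters: Trump  > Kasich > Clinton
#   33 voters: Clinton > Kasich > Trump
#   32 voters: Kasich > Clinton > Trump
ballots = (
    [["Trump", "Kasich", "Clinton"]] * 35
    + [["Clinton", "Kasich", "Trump"]] * 33
    + [["Kasich", "Clinton", "Trump"]] * 32
)
print(irv_winner(ballots))  # → Clinton
```

With this profile, Kasich would win head-to-head against Trump (65–35) and against Clinton (67–33), yet IRV eliminates him first for having the fewest first-choice votes.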
At any one time I usually have between 1 and 3 "big ideas" I'm working with. These are generally broad ideas about how some thing works with many implications for how the rest of the whole world works. Some big ideas I've grappled with over the years, in roughly historical order:
There are two popular language learning software platforms: Anki and Duolingo. Anki is hard, free and effective. Duolingo is easy, commercial and ineffective.
The number of Duolingo users far outstrips the number of Anki users. Duolingo has 8 million downloads on the Play Store. Anki has 40 thousand. So there are 200 Duolingo users for every Anki user. If you ask a random language learner what software to use they'll probably suggest Duolingo. If you ask a random successful language learner what software to use they'll probably suggest Anki. Most language learners are unsuccessful.
It should... (Read more)
This is a response to Abram's The Parable of Predict-O-Matic, but you probably don't need to read Abram's post to understand mine. While writing this, I thought of a way in which I think things could go wrong with dualist Predict-O-Matic, which I plan to post in about a week. I'm offering a $100 prize to the first commenter who's able to explain how things might go wrong in a sufficiently crisp way before I make my follow-up post.
Currently, machine learning algorithms are essentially "Cartesian dualists" when it comes to themselves and their environment. (Not a philosophy major -- let... (Read more)
Internal Family Systems (IFS) is a psychotherapy school/technique/model which lends itself particularly well to being used alone or with a peer. For years, I had noticed that many of the kinds of people who put a lot of work into developing their emotional and communication skills, some within the rationalist community and some outside it, kept mentioning IFS.
So I looked at the Wikipedia page about the IFS model, and bounced off, since it sounded like nonsense to me. Then someone brought it up again, and I thought that maybe I should reconsider. So I looked at the WP page again... (Read more)
This post is for you if:
Years ago, I read David Allen's "Getting Things Done". One of the core ideas is to write down everything, collect it in an inbox, and sort it once a day.
This led to me writing down tons of small tasks. I used Todoist to construct a system that worked for me, and I rarely missed tasks.
It also led to me getting a lot of id... (Read more)
Find all Alignment Newsletter resources here. In particular, you can sign up, or look through this spreadsheet of all summaries that have ever been in the newsletter. I'm always happy to hear feedback; you can send it to me by replying to this email.
This is a bonus newsletter summarizing Stuart Russell's new book, along with summaries of a few of the most relevant papers. It's entirely written by Rohin, so the usual "summarized by" tags have been removed.
We're also changing the publishing schedule: so far, we've aimed to send a newsletter every Monday; we&a... (Read more)
With the dubiously motivated PG&E blackouts in California there are many stories about how lack of power is a serious problem, especially for people with medical dependencies on electricity. Examples they give include people who:
Have severe sleep apnea, and can't safely sleep without a CPAP.
Sleep on a mattress that needs continuous electricity to prevent it from deflating.
Need to keep their insulin refrigerated.
Use a medicine delivery system that requires electricity every four hours to operate.
This outage was dangerous for them and others, but it also see... (Read more)
I've mentioned in posts twice (and previously in several comments) that I'm excited about predictive coding, specifically the idea that the human brain either is or can be modeled as a hierarchical system of (negative feedback) control systems that try to minimize error in predicting their inputs with some strong (possibly un-updatable) prediction set points (priors). I'm excited because I believe this approach better describes a wide range of human behavior, including subjective mental experiences, than any other theory of how the mind works; it's compatible with many othe... (Read more)
Facebook AI releases a new SOTA "weakly semi-supervised" learning system for video and image classification. I'm posting this here because even though it's about capabilities, the architecture includes an amplification-like component where a higher-capacity teacher decides how to train a lower-capacity student model.
This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far see the announcement post. For the schedule of future topics, see MIRI's reading guide.
Welcome. This week we discuss the seventh section in the reading guide: Decisive strategic advantage. This corresponds to Chapter 5.
This post summarizes the section, offers a few relevant notes, and suggests ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.
There is no need to pro... (Read more)
Suppose that 1% of the world’s resources are controlled by unaligned AI, and 99% of the world’s resources are controlled by humans. We might hope that at least 99% of the universe’s resources end up being used for stuff-humans-like (in expectation).
Jessica Taylor argued for this conclusion in Strategies for Coalitions in Unit-Sum Games: if the humans divide into 99 groups, each of which acquires influence as effectively as the unaligned AI, then by symmetry each group should end up with as much influence as the AI, i.e. the groups should collectively end up with 99% of the influence.
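A minimal way to write out the symmetry step (my own notation, not taken from Taylor's post), treating influence as a unit-sum quantity split among 100 equally effective agents:

Let $x_0$ be the unaligned AI's final influence and $x_1, \dots, x_{99}$ the final influence of the 99 human groups. Influence is unit-sum:
$$\sum_{i=0}^{99} x_i = 1.$$
If each human group acquires influence exactly as effectively as the AI, then by symmetry
$$x_0 = x_1 = \cdots = x_{99} = \tfrac{1}{100},$$
so the human groups collectively hold
$$\sum_{i=1}^{99} x_i = \tfrac{99}{100} = 99\%.$$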
This argument rests on what I... (Read more)