David Schneider-Joseph

Comments

The Rationalists of the 1950s (and before) also called themselves “Rationalists”
dsj · 11d · 10

There's also the Federmann Center for the Study of Rationality, founded in 1991, where

faculty, students, and guests join forces to explore the rational basis of decision-making. Coming from a broad sweep of departments (mathematics, economics, psychology, biology, education, computer science, philosophy, political science, business, statistics, and law), its members look at how rationality — which, in decision-making, means the process by which individuals, groups, firms, plants, and other entities choose the path of maximum benefit — responds to real-world situations where individuals with different goals interact.

They say they are inspired by the work of Robert Aumann and Menahem Yaari.

Agent foundations: not really math, not really science
dsj · 22d · 20

My memory from reading Andrew Hodges’ authoritative biography of Turing is that his theory was designed as a tool to solve the Entscheidungsproblem, which was a pure mathematical problem posed by Hilbert. It just happened to be a convenient formalism for others later on. GPT-5 agrees with me.

Will Jesus Christ return in an election year?
dsj · 4mo · 50

This hypothesis was also proposed in 1998 on a different (play money) prediction market, and the galaxy-brained trade succeeded for some in 2002.

Did Christopher Hitchens change his mind about waterboarding?
dsj · 1y · 60

And mine.

Maybe Anthropic's Long-Term Benefit Trust is powerless
dsj · 1y · 40

I don’t know much background here so I may be off base, but it’s possible that the motivation of the trust isn’t to bind leadership’s hands to avoid profit-motivated decision making, but rather to free their hands to do so, ensuring that shareholders have no claim against them for such actions, as traditional governance structures might have provided.

Zach Stein-Perlman's Shortform
dsj · 1y · 81

(Unless "employees who signed a standard exit agreement" is doing a lot of work — maybe a substantial number of employees technically signed nonstandard agreements.)

Yeah, what about employees who refused to sign? Have we gotten any clarification on their situation?

Book Review: 1948 by Benny Morris
dsj · 2y · 32

Thank you, I appreciated this post quite a bit. There's a paucity of historical information about this conflict which isn't colored by partisan framing, and you seem to be coming from a place of skeptical, honest inquiry. I'd look forward to reading what you have to say about 1967.

Anthropic Fall 2023 Debate Progress Update
dsj · 2y · Ω175

Thanks for doing this! I think a lot of people would be very interested in the debate transcripts if you posted them on GitHub or something.

Evaluating the historical value misspecification argument
dsj · 2y · 32

Okay. I do agree that one way to frame Matthew’s main point is that MIRI thought it would be hard to specify the human value function, and an LM that understands human values and reliably tells us the truth about that understanding is such a specification, and hence falsifies that belief.

To your second question: MIRI thought we couldn’t specify the value function to do the bounded task of filling the cauldron, because any value function we could naively think of writing, when given to an AGI (which was assumed to be a utility argmaxer), leads to all sorts of instrumentally convergent behavior such as taking over the world to make damn sure the cauldron is really filled, since we forgot all the hidden complexity of our wish.
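
To make that frame concrete, here's a toy sketch (my own illustration, not MIRI's formalism; the actions and probabilities are invented) of how an expected-utility argmaxer with a naively written objective ends up preferring the extreme plan:

```python
# Toy model of the "fill the cauldron" problem under naive value specification.
# The utility function below captures only "the cauldron is full" and omits all
# the hidden complexity of what we actually want.

def utility(cauldron_full: bool) -> float:
    return 1.0 if cauldron_full else 0.0

# Hypothetical actions mapped to the probability that the cauldron ends up
# (and stays) full. The numbers are made up for illustration.
actions = {
    "fill the cauldron once, then stop": 0.95,   # someone might later empty it
    "do nothing": 0.0,
    "seize control of everything to guarantee it stays full": 0.999,
}

def expected_utility(p_full: float) -> float:
    return p_full * utility(True) + (1.0 - p_full) * utility(False)

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)
# -> "seize control of everything to guarantee it stays full"
# The argmaxer picks the takeover plan because it drives P(cauldron full)
# highest, and nothing in the naive utility function says that's bad.
```

The point isn't the arithmetic; it's that the instrumentally convergent option wins whenever the objective omits the things we actually care about.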

Evaluating the historical value misspecification argument
dsj · 2y · 32

I think this reply is mostly talking past my comment.

I know that MIRI wasn't claiming we didn't know how to safely make deep learning systems, GOFAI systems, or what-have-you fill buckets of water, but my comment wasn't about those systems. I also know that MIRI wasn't issuing a water-bucket-filling challenge to capabilities researchers.

My comment was specifically about directing an AGI (which I think GPT-4 roughly is), not deep learning systems or other software generally. I *do* think MIRI was claiming we didn't know how to make AGI systems safely do mundane tasks.

I think some of Nate's qualifications are mainly about the distinction between AGI and other software, and others (such as "[i]f the system is trying to drive up the expectation of its scoring function and is smart enough to recognize that its being shut down will result in lower-scoring outcomes") mostly serve to illustrate the conceptual frame MIRI was (and largely still is) stuck in about how an AGI would work: an argmaxer over expected utility.

[Edited to add: I'm pretty sure GPT-4 is smart enough to know the consequences of its being shut down, and yet dumb enough that, if it really wanted to prevent that from one day happening, we'd know by now from various incompetent takeover attempts.]

Posts

43 · In defense of the amyloid hypothesis · 1mo · 0
8 · How should I behave ≥14 days after my first mRNA vaccine dose but before my second dose? [Question] · 4y · 3