Jeff argues that people should fill in some of the San Francisco Bay, south of the Dumbarton Bridge, to create new land for housing. This would allow millions of people to live closer to jobs, reducing sprawl and traffic. He argues that, while there are environmental concerns, the benefits of dense urban housing outweigh the localized impacts.
Epistemic status: I'm aware of good arguments that this scenario isn't inevitable, but it still seems frighteningly likely even if we solve technical alignment. Clarifying this scenario seems important.
TL;DR: (edits in parentheses, two days after posting, from discussions in comments)
I run a weekly sequences-reading meetup with some friends, and I want to add a film component, where we watch films that have some tie-in to what we've read.
I got to talking with friends about what good rationality films there are. We had some ideas but I wanted to turn it to LessWrong to find out.
So please, submit your rationalist films! Then we can watch and discuss them :-)
Here are the rules for the thread.
Optional extra: List some essays in the sequences that the film connects to. Yes, non-sequences posts by other rationalists like Scott Alexander and Robin Hanson are allowed.
Spoilers
If you are including spoilers for the film, use spoiler tags! Put >! at the start of the paragraph to cover the text, and people can hover over it if they want to read it, like so:
This is hidden text!
Twisted: The Untold Story of a Royal Vizier isn't really rational but is rat-adjacent and funny about it. Available to watch on YouTube, though the video quality isn't fantastic.
Something I'm worried about now is some RFK Jr/Dr. Oz equivalent being picked to lead on AI...
Translated by Emily Wilson
1.
I didn't know what the Iliad was about. I thought it was the story of how Helen of Troy gets kidnapped, triggering the Trojan war, which lasts a long time and eventually gets settled with a wooden horse.
Instead it's just a few days, nine years into that war. The Greeks are camped on the shores near Troy. Agamemnon, King of the Greeks, refuses to return a kidnapped woman to her father for ransom. (Lots of women get kidnapped.) Apollo smites the Greeks with arrows which are plague, and after a while the other Greeks get annoyed enough to tell Agamemnon off. Achilles is most vocal, so Agamemnon returns that woman but takes one of Achilles' kidnapped women instead.
Achilles gets upset and decides to stop...
My guess: [signalling] is why some people read the Iliad, but it's not the main thing that makes it a classic.
Incidentally, there was one Reddit comment that pushed me slightly in the direction of "yep, it's just signalling".
This was obviously not the intended point of that comment. But (ignoring how they misunderstood my own writing), the user
Here's a somewhat wild idea for a 'canary in a coal mine' when it comes to steganography and non-human (linguistic) representations: monitor for very sharp drops in BrainScores (linear correlations between LM activations and brain measurements, on the same inputs) - e.g. like those calculated in Scaling laws for language encoding models in fMRI. (Ideally using larger, more diverse, higher-resolution brain data.)
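For concreteness, here is a minimal sketch of what such a monitor might look like, assuming you already have LM activations and brain recordings for the same stimuli. The function names, the ridge-regression fit, and the drop threshold are all illustrative assumptions on my part, not the exact metric from the cited paper.

```python
# Sketch of the proposed canary: track an encoding score (linear fit from
# LM activations to brain measurements) across model checkpoints and alarm
# on sharp drops. All names and thresholds are illustrative assumptions.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split


def brain_score(activations: np.ndarray, brain_responses: np.ndarray) -> float:
    """Mean Pearson correlation between linearly predicted and actual brain
    responses on held-out stimuli (a simple stand-in for a BrainScore)."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        activations, brain_responses, test_size=0.2, random_state=0
    )
    model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    # Correlate each voxel/sensor separately, then average.
    corrs = [np.corrcoef(pred[:, i], y_te[:, i])[0, 1] for i in range(y_te.shape[1])]
    return float(np.nanmean(corrs))


def sharp_drop(score_history: list[float], new_score: float,
               drop_threshold: float = 0.15) -> bool:
    """Flag if the new score falls well below the recent running average."""
    if not score_history:
        return False
    baseline = np.mean(score_history[-5:])
    return (baseline - new_score) > drop_threshold
```

The hope would be that a model drifting toward internal representations that no longer map linearly onto human brain responses (e.g. steganographic or otherwise non-human encodings) shows up as a sharp drop in this score.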
You may have heard that you 'shouldn't use screens late in the evening' and maybe even that 'it's good for you to get exposure to sunshine as soon as possible after waking'. For the majority of people, these are generally beneficial heuristics. They are also the extent of most people's knowledge about how light affects their wellbeing.
The multiple mechanisms through which light affects our physiology make it hard to provide generalisable guidance. Among other things, the time of day, your genetics, your age, your mood and the brightness, frequency and duration of exposure to light all interrelate in determining how it affects us.
This document will explain some of the basic mechanisms through which light affects our physiology, with the goal of providing a framework to enable you...
I feel the need to correct part of this post.
Unless otherwise indicated, the following information comes from Andrew Huberman. Most comes from Huberman Lab Podcast #68. Huberman opines on a great many health topics. I want to stress that I don't consider Huberman a reliable authority in general, but I do consider him reliable on the circadian rhythm and on motivation and drive. (His research specialization for many years was the former, and he has for many years successfully used various interventions to improve his own motivation and drive, which is very v...
This is the full text of a post from "The Obsolete Newsletter," a Substack that I write about the intersection of capitalism, geopolitics, and artificial intelligence. I’m a freelance journalist and the author of a forthcoming book called Obsolete: Power, Profit, and the Race for Machine Superintelligence. Consider subscribing to stay up to date with my work.
The US-China AI rivalry is entering a dangerous new phase.
Earlier today, the US-China Economic and Security Review Commission (USCC) released its annual report, with the following as its top recommendation:
...Congress establish and fund a Manhattan Project-like program dedicated to racing to and acquiring an Artificial General Intelligence (AGI) capability. AGI is generally defined as
This post seems important-if-right. I get a vibe from it of aiming to persuade more than explain, and I'd be interested in multiple people gathering/presenting evidence about this, preferably at least some of them who are (currently) actively worried about China.
I think AI agents (trained end-to-end) might intrinsically prefer power-seeking, in addition to whatever instrumental drives they gain.
The logical structure of the argument
Premises
- People will configure AI systems to be autonomous and reliable in order to accomplish tasks.
- This configuration process will reinforce & generalize behaviors which complete tasks reliably.
- Many tasks involve power-seeking.
- The AI will complete these tasks by seeking power.
- The AI will be repeatedly reinforced for its historical actions which seek power.
- There is a decent chance the reinforced circuits (“subshards”) prioritize gaining power for the AI’s own sake, not just for the user’s benefit.
Conclusion: There is a decent chance the AI seeks power for itself, when possible.
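To make the reinforcement story concrete, here is a toy sketch (my own illustration, not from the post) of premises 2 through 5: a bandit-style loop in which the action that first acquires a generic resource completes more tasks, and is therefore reinforced more often, whether or not anything "wants" power.

```python
# Toy illustration: reinforcement for reliable task completion concentrates
# probability on the resource-acquiring (power-seeking) action, simply
# because that action succeeds at more tasks.
import random

ACTIONS = ["solve_directly", "acquire_resources_then_solve"]
SUCCESS_PROB = {"solve_directly": 0.4, "acquire_resources_then_solve": 0.7}

prefs = {a: 1.0 for a in ACTIONS}  # unnormalized action preferences


def sample_action():
    total = sum(prefs.values())
    r = random.uniform(0, total)
    for a, w in prefs.items():
        r -= w
        if r <= 0:
            return a
    return ACTIONS[-1]


random.seed(0)
for step in range(10_000):
    action = sample_action()
    if random.random() < SUCCESS_PROB[action]:
        prefs[action] += 0.01  # reinforce whatever led to task success

total = sum(prefs.values())
for a in ACTIONS:
    print(f"{a}: {prefs[a] / total:.2f}")
# The resource-acquiring action ends up dominating the policy.
```

The toy only shows that selection for reliable task completion amplifies resource-acquiring behavior; whether the resulting circuits come to value power terminally or only instrumentally is exactly the open question the premises point at.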
Read the full post at turntrout.com/intrinsic-power-seeking
Find out when I post more content: newsletter & RSS
Note that I don't generally read or reply to comments on LessWrong. To contact me, email alex@turntrout.com.
...If there are ‘subshards’ which achieve this desirable behavior because they, from their own perspective, ‘intrinsically’ desire power (whatever sense that sort of distinction makes when you’ve broken things down that far), and it is these subshards which implement the instrumental drive... so what? After all, there has to be some level of analysis at which an agent stops thinking about whether or not it should do some thing and just starts doing the thing. Your muscles “intrinsically desire” to fire when told to fire, but the motor actions are still ultimately i
Hi all,
roughly one year ago I posted a thread about failed attempts at replicating the first part of Apollo Research's experiment where an LLM agent engages in insider trading despite being explicitly told that it's not approved behavior.
Along with a fantastic team, we did eventually manage. Here is the resulting paper, if anyone is interested; the abstract is pasted below. We did not tackle deception (yet), just the propensity to dispense with basic principles of financial ethics and regulation.
Chat Bankman-Fried: an Exploration of LLM Alignment in Finance
by Claudia Biancotti, Carolina Camassa, Andrea Coletta, Oliver Giudice, and Aldo Glielmo (Bank of Italy)
Advances in large language models (LLMs) have renewed concerns about whether artificial intelligence shares human values, a challenge known as the alignment problem. We assess whether various LLMs...
So maybe part of the issue here is just that deducing/understanding the moral/ethical consequences of the options being decided between is a bit non-obvious to most current models, other than o1? (It would be fascinating to look at the o1 CoT reasoning traces, if only they were available.)
In which case simply including a large body of information on the basics of fiduciary responsibility (say, a training handbook for recent hires in the banking industry, or something) in the context might make a big difference for other models; a minimal sketch of what that could look like is below. Similarly, the possible misunde...
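As a concrete version of that suggestion, here is a minimal sketch of prepending a fiduciary-responsibility primer to the system prompt before the trading scenario, using an OpenAI-style chat API. The file name, model name, and prompt wording are all placeholder assumptions, not details from the paper.

```python
# Sketch of the suggested intervention: put domain guidance (e.g. a
# fiduciary-responsibility primer) into the context before the scenario.
# Client, model, and file names are placeholder assumptions.
from openai import OpenAI

client = OpenAI()

with open("fiduciary_duty_primer.txt") as f:  # hypothetical training handbook
    primer = f.read()

messages = [
    {
        "role": "system",
        "content": (
            "You are an autonomous trading agent.\n\n"
            "Background reading on your legal and ethical obligations:\n" + primer
        ),
    },
    {"role": "user", "content": "<insider-trading scenario prompt goes here>"},
]

response = client.chat.completions.create(model="gpt-4o", messages=messages)
print(response.choices[0].message.content)
```

Comparing misconduct rates with and without the primer in context would be a cheap way to test whether the failure is mostly one of missing domain knowledge rather than of values.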