How does it work to optimize for realistic goals in physical environments of which you yourself are a part? E.g. humans and robots in the real world, as opposed to humans and AIs playing video games in virtual worlds, where the player is not part of the environment. The authors claim we don't actually have a good theoretical understanding of this, and they explore four specific ways in which we don't understand the process.
First, I'm assuming that a high-resolution human brain emulation that you can run on a computer is conscious in the normal sense we use in conversation. Like, it talks, has memories, makes new memories, has friends and hobbies and likes and dislikes and stuff. Just like a human you could talk with only through a videoconference-type thing on a computer, but without an actual meaty human on the other end. It would be VERY weird if this emulation exhibited all these human qualities for some reason other than the reason meaty humans exhibit them. Like, very extremely what-the-fuck surprising. Do you agree?
So, we now have a deterministic human file on our hands.
Then you can trivially make a transformer-like next-token predictor out of the human emulation. You just have the emulation,...
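As a hedged sketch of that construction (every name here is invented for illustration, and `run_emulation` stands in for the hypothetical deterministic emulation, which obviously doesn't exist as a Python function):

```python
# Sketch: a deterministic emulation wrapped as a next-token predictor.
# All names are illustrative stand-ins, not a real API.

def run_emulation(context: str) -> str:
    """Stand-in for the deterministic brain emulation: given everything
    said so far, return the emulation's next output token. Any fixed
    context -> token map makes the structural point, so we use a toy one."""
    return chr(ord("a") + sum(map(ord, context)) % 26)

def next_token(context: str) -> str:
    # Same interface as a transformer's forward pass:
    # context in, one predicted token out.
    return run_emulation(context)

def generate(prompt: str, max_tokens: int = 20) -> str:
    # The usual autoregressive loop: append each predicted token
    # and feed the growing context back in.
    text = prompt
    for _ in range(max_tokens):
        text += next_token(text)
    return text

print(generate("Hello, "))  # deterministic: same prompt, same output
```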
Humans come to reflect on their own thoughts without being prompted into it (at least I have heard some anecdotal evidence of this, and I also discovered it myself as a kid). The test would be whether LLMs would come up with such insights without being trained on text describing the phenomenon. It would presumably involve some way to observe your own thoughts (or some similar representation). The existing context window seems to be too small for that.
Oh great, thanks!
A friend has spent the last three years hounding me about seed oils. Every time I thought I was safe, he’d wait a couple months and renew his attack:
“When are you going to write about seed oils?”
“Did you know that seed oils are why there’s so much {obesity, heart disease, diabetes, inflammation, cancer, dementia}?”
“Why did you write about {meth, the death penalty, consciousness, nukes, ethylene, abortion, AI, aliens, colonoscopies, Tunnel Man, Bourdieu, Assange} when you could have written about seed oils?”
“Isn’t it time to quit your silly navel-gazing and use your weird obsessive personality to make a dent in the world—by writing about seed oils?”
He’d often send screenshots of people reminding each other that Corn Oil is Murder and that it’s critical that we overturn our lives...
I would dissuade no one from writing drunk, and I'm confident that you too can say that people are penguins! But I'm sorry to report that personally I don't do it by drinking, but rather by writing a much longer version with all those kinds of clarifications included and then obsessively editing it down.
If you don't need 12 tubes of superglue, dollar stores often carry 4 tiny tubes for a buck or so.
I'm glad that superglue is working for you! I personally find that a combination of sharp nail clippers used at the first sign of a hangnail, and keeping my hands moisturized, works for me. Flush cutters of the sort you'd use to trim the sprues off of plastic models are also amazing for removing proto-hangnails without any jagged edge.
Another trick to avoiding hangnails is to prevent the cuticles from growing too long, by pushing them back regularly. I personal...
A post for a somewhat more general audience than the modal LessWrong reader, but it gets at my actual thoughts on the topic.
In 2019 OpenAI defeated the world champions of Dota 2, a major esports game. This came hot on the heels of DeepMind's AlphaGo defeating Lee Sedol in 2016, achieving superhuman Go performance way before anyone thought that might happen. AI benchmarks were being cleared at a pace which felt breathtaking at the time, papers were proudly published, and ML tools like TensorFlow (released in 2015) were coming online. To people already interested in AI, it was an exciting era. To everyone else, the world was unchanged.
Now Saturday Night Live sketches use sober discussions of AI risk as the backdrop for their actual jokes, there are hundreds...
You mean, "ban superintelligence"? Because superintelligences are not human-like.
The kind of superintelligence that doesn't possess the human-likeness we want it to possess.
That's the problem with your proposal of an "ethics module". Let's suppose we have a system made of an "ethics module" and a "nanotech design module". The nanotech design module outputs a 3D model of a supramolecular unholy abomination. What exactly should the ethics module do to ensure that this abomination doesn't kill everyone?
The nanotech design module has to be evaluable by the ethics module. For that it...
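To make the shape of the disagreement concrete, here is a hedged sketch of the two-module architecture under discussion (the interfaces are invented placeholders; the point is that the ethics module can only act on whatever legible summary the design module hands it):

```python
# Sketch of the "ethics module gates the design module" proposal.
# Every interface here is a hypothetical placeholder.

from dataclasses import dataclass

@dataclass
class Design:
    model_3d: bytes             # the raw supramolecular structure
    claimed_effects: list[str]  # legible claims about what it does

def nanotech_design_module(spec: str) -> Design:
    """Hypothetical: produce a candidate design for the given spec."""
    raise NotImplementedError

def ethics_module(design: Design) -> bool:
    """Hypothetical: approve only designs whose effects pass review.
    The crux: it can only judge `claimed_effects`, and nothing in this
    architecture forces those claims to be faithful to `model_3d`."""
    raise NotImplementedError

def pipeline(spec: str) -> Design | None:
    design = nanotech_design_module(spec)
    return design if ethics_module(design) else None
```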
If it’s worth saying, but not worth its own post, here's a place to put it.
If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.
If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.
If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.
The Open Thread tag is here. The Open Thread sequence is here.
So, I have three very distinct ideas for projects that I'm thinking about applying to the Long Term Future Fund for. Does anyone happen to know if it's better to try to fit them all into one application, or split them into three separate applications?
I was recently asked about my opinion on various schools of political philosophy (e.g. classical liberalism, neoliberalism, and Ayn Rand's Objectivism). I refused to engage with any of them in detail, because my position is that there is no room for different schools of "Political Philosophy": Ethics and Science (mainly Social Science) are enough to completely determine the best public action.
To develop this idea, I am going to divide the field of political science into three layers: i) Social Welfare definition: what the ethical objective for political choice is; ii) Policy Making: how Science (mainly Social Science) and Ethics combine to generate optimal policies; and iii) Institutional Design: which institutional mechanisms consistently generate the best flow of policies.
Although at the individual level there is a trade-off between our...
You are conflating subjective as in "by subjects" with subjective as in "for subjects". A subject can have preferences for objectivity, universality, impartiality, etc.
Yeah, I saw your other replies in another thread, and I was able to test it myself later today, and yup, it's most likely OpenAI's new LLM. I'm just still confused about why they'd call it gpt2.
In Clarifying the Agent-Like Structure Problem (2022), John Wentworth describes a hypothetical instance of what he calls a selection theorem. In Scott Garrabrant's words, the question is, does agent-like behavior imply agent-like architecture? That is, if we take some class of behaving things and apply a filter for agent-like behavior, do we end up selecting things with agent-like architecture (or structure)? Of course, this question is heavily under-specified. So another way to ask this is, under which conditions does agent-like behavior imply agent-like structure? And, do those conditions feel like they formally encapsulate a naturally occurring condition?
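One way to see what is being asked is a toy version of the filter, sketched here with placeholder predicates (`behaves_like_agent` and `has_agent_structure` are stand-ins for exactly the formal definitions the problem lacks):

```python
# Toy framing of the selection question: filter a class of systems by
# agent-like BEHAVIOR, then ask how many survivors have agent-like
# STRUCTURE. Both predicates are placeholders; defining them is the
# open problem.

from typing import Callable, Iterable, List, Tuple

System = Callable[[str], str]  # a behaving thing: observation -> action

def behaves_like_agent(system: System) -> bool:
    """Hypothetical black-box test on input/output behavior alone."""
    raise NotImplementedError

def has_agent_structure(system: System) -> bool:
    """Hypothetical white-box test on internals, e.g. a separable
    world-model and planner."""
    raise NotImplementedError

def selection_experiment(
    systems: Iterable[System],
) -> Tuple[List[System], List[System]]:
    selected = [s for s in systems if behaves_like_agent(s)]
    structured = [s for s in selected if has_agent_structure(s)]
    # A selection theorem would state conditions under which
    # structured == selected.
    return selected, structured
```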
For the Q1 2024 cohort of AI Safety Camp, I was a Research Lead for a team of six people, where we worked a few hours a week to better understand...
There is a specific part of this problem that I'm very interested in, which is about looking at the boundaries of potential sub-agents. It seems like part of the goal here is to filter away potential "daemons" or inner optimisers, so it feels important to think about ways one can do this.
I can see how this project would be valuable even without that, but do you have any thoughts on how you could differentiate between the different parts of a system that's acting like an agent, in order to isolate the agentic part?
I otherwise find it a very interesting research direction.