How does it work to optimize for realistic goals in physical environments of which you yourself are a part? E.g. humans and robots in the real world, and not humans and AIs playing video games in virtual worlds where the player is not part of the environment. The authors claim we don't actually have a good theoretical understanding of this and explore four specific ways that we don't understand this process.

avturchin 21h
ChatGPT 4.5 is in preview at https://chat.lmsys.org/ under the name gpt-2. It calls itself ChatGPT 2.0 in a text-art drawing: https://twitter.com/turchin/status/1785015421688799492
Raemon 2d
Yesterday I was at a "cultivating curiosity" workshop beta-test. One concept was "there are different mental postures you can adopt, that affect how easy it is to notice and cultivate curiosities." It wasn't exactly the point of the workshop, but I ended up with several different "curiosity-postures" that were useful to try on while trying to lean into "curiosity" re: topics that I feel annoyed or frustrated or demoralized about.

The default stances I end up with when I Try To Do Curiosity On Purpose are something like:

1. Dutiful Curiosity (which is kinda fake, although capable of being dissociatedly autistic and noticing lots of details that exist and questions I could ask)
2. Performatively Friendly Curiosity (also kinda fake, but does shake me out of my default way of relating to things. In this, I imagine saying to whatever thing I'm bored/frustrated with "hullo!" and try to acknowledge it and give it at least some chance of telling me things)

But some other stances to try on, that came up, were:

3. Curiosity like "a predator." "I wonder what that mouse is gonna do?"
4. Earnestly playful curiosity. "oh that [frustrating thing] is so neat, I wonder how it works! what's it gonna do next?"
5. Curiosity like "a lover". "What's it like to be that you? What do you want? How can I help us grow together?"
6. Curiosity like "a mother" or "father" (these feel slightly different to me, but each is treating [my relationship with a frustrating thing] like a small child who is a bit scared, who I want to help, who I am generally more competent than but still want to respect the autonomy of).
7. Curiosity like "a competent but unemotional robot", who just algorithmically notices "okay, what are all the object-level things going on here, when I ignore my usual abstractions?"... and then "okay, what are some questions that seem notable?" and "what are my beliefs about how I can interact with this thing?" and "what can I learn about this thing that'd be useful for my goals?"
decision theory is no substitute for utility function

some people, upon learning about decision theories such as LDT and how they cooperate on problems such as the prisoner's dilemma, end up believing the following:

> my utility function is about what i want for just me; but i'm altruistic (/egalitarian/cosmopolitan/pro-fairness/etc) because decision theory says i should cooperate with other agents. decision-theoretic cooperation is the true name of altruism.

it's possible that this is true for some people, but in general i expect that to be a mistaken analysis of their values. decision theory cooperates with agents relative to how much power they have, and only when it's instrumental. in my opinion, real altruism (/egalitarianism/cosmopolitanism/fairness/etc) should be in the utility function which the decision theory is instrumental to. i actually intrinsically care about others; i don't just care about others instrumentally because it helps me somehow.

some important ways in which my utility-function-altruism differs from decision-theoretic cooperation include:

* i care about people weighted by moral patienthood; decision theory only cares about agents weighted by negotiation power. if an alien superintelligence is very powerful but isn't a moral patient, then i will only cooperate with it instrumentally (for example because i care about the alien moral patients that it has been in contact with); if cooperating with it doesn't help my utility function (which, again, includes altruism towards aliens) then i won't cooperate with that alien superintelligence. corollarily, i will take actions that cause nice things to happen to people even if they're very impoverished (and thus don't have much LDT negotiation power) and it doesn't help any other aspect of my utility function than just the fact that i value that they're okay.
* if i can switch to a better decision theory, or if fucking over some non-moral-patienty agents helps me somehow, then i'll happily do that; i don't have goal-content integrity about my decision theory. i do have goal-content integrity about my utility function: i don't want to become someone who wants moral patients to unconsentingly-die or suffer, for example.
* there seems to be a sense in which some decision theories are better than others, because they're ultimately instrumental to one's utility function. utility functions, however, don't have an objective measure for how good they are. hence, moral anti-realism is true: there isn't a Single Correct Utility Function.

decision theory is instrumental; the utility function is where the actual intrinsic/axiomatic/terminal goals/values/preferences are stored. usually, i also interpret "morality" and "ethics" as "terminal values", since most of the stuff that those seem to care about looks like terminal values to me. for example, i will want fairness between moral patients intrinsically, not just because my decision theory says that that's instrumental to me somehow.
The cost of goods has the same units as the cost of shipping: $/kg. Referencing between them lets you understand how the economy works, e.g. why construction material sourcing and drink bottling have to be local, but oil tankers exist.

* An iPhone costs $4,600/kg, about the same as SpaceX charges to launch it to orbit. [1]
* Beef, copper, and off-season strawberries are $11/kg, about the same as a 75kg person taking a three-hour, 250km Uber ride costing $3/km.
* Oranges and aluminum are $2-4/kg, about the same as flying them to Antarctica. [2]
* Rice and crude oil are ~$0.60/kg, about the same as the $0.72 it costs to ship a kilogram 5000km across the US via truck. [3,4] Palm oil, soybean oil, and steel are around this price range, with wheat being cheaper. [3]
* Coal and iron ore are $0.10/kg, significantly more than the cost of shipping them around the entire world via smallish (Handysize) bulk carriers. Large bulk carriers are another 4x more efficient. [6]
* Water is very cheap, with tap water at $0.002/kg in NYC. [5] But shipping via tanker is also very cheap, so you can ship it maybe 1000 km before equaling its cost.

It's really impressive that for the price of a winter strawberry, we can ship a strawberry-sized lump of coal around the world 100-400 times.

[1] iPhone is $4600/kg, large launches sell for $3500/kg, and rideshares for small satellites $6000/kg. Geostationary orbit is more expensive, so it's okay for GPS satellites to cost more than an iPhone per kg, but Starlink wants to be cheaper.
[2] https://fred.stlouisfed.org/series/APU0000711415. Can't find current numbers, but Antarctica flights cost $1.05/kg in 1996.
[3] https://www.bts.gov/content/average-freight-revenue-ton-mile
[4] https://markets.businessinsider.com/commodities
[5] https://www.statista.com/statistics/1232861/tap-water-prices-in-selected-us-cities/
[6] https://www.researchgate.net/figure/Total-unit-shipping-costs-for-dry-bulk-carrier-ships-per-tkm-EUR-tkm-in-2019_tbl3_351748799
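As a rough sanity check on the $/kg comparisons above, here is a minimal Python sketch that converts a flat freight rate into a per-kilogram shipping cost. The $/tonne-km rates are illustrative assumptions backed out from the figures in this post, not sourced numbers.

```python
# Minimal sketch: convert a flat freight rate in $/tonne-km into $/kg for a distance.
# The rates below are illustrative assumptions implied by the figures above.

def shipping_cost_per_kg(rate_usd_per_tonne_km: float, distance_km: float) -> float:
    """Cost in dollars to move one kilogram the given distance at a flat $/tonne-km rate."""
    return rate_usd_per_tonne_km * distance_km / 1000.0  # 1 tonne = 1000 kg

# Trucking at roughly $0.14/tonne-km reproduces the ~$0.72 figure for 5000 km:
print(shipping_cost_per_kg(0.144, 5000))    # ~0.72 $/kg, on par with rice or crude oil

# A bulk carrier at something like $0.001/tonne-km makes a ~40,000 km lap of the globe
# cost a few cents per kg, well under the $0.10/kg price of coal:
print(shipping_cost_per_kg(0.001, 40000))   # ~0.04 $/kg
```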
Anyone paying attention to the mystery of the GPT-2 chatbot that has appeared on lmsys? People are saying it operates at levels comparable to or exceeding GPT-4. I'm writing because the appearance of mysterious unannounced chatbots for public use, without provenance, makes me update my p(doom) upward. Possibilities:

1. This is an OpenAI chatbot based on GPT-4, just like it says it is. It has undergone some more tuning and maybe has boosted reasoning because of methods described in one of the more recently published papers.
2. This is another big American AI company masquerading as OpenAI.
3. This is a big Chinese AI company masquerading as OpenAI.
4. This is an anonymous person or group using some GPT-4 fine-tune API to improve performance.

Possibility 1 seems most likely. If that is the case, I guess it is alright, assuming it is purely based on GPT-4 and isn't a new model. I suppose if they wanted to test on lmsys to gauge performance anonymously, they couldn't slap 4.5 on it, but they also couldn't ethically give it the name of another company's model. Giving it an entirely new name would invite heavy suspicion. So calling it the name of an old model and monitoring how it does in battle seems like the most ethical compromise. Still, even labeling a model with a different name feels deceptive.

Possibility 2 would be extremely unethical and I don't think it is the case. Also, the behavior of the model looks more like GPT-4 than another model. I expect lawsuits if this is the case.

Possibility 3 would be extremely unethical, but is possible. Maybe they trained a model on many GPT-4 responses and then did some other stuff. Stealing a model in this way would probably accelerate KYC legislation and yield outright bans on Chinese rental of compute. If this is the case, then there is no moat because we let our moat get stolen.

Possibility 4 is something someone mentioned on Twitter. I don't know whether it is viable.

In any case, releasing models in disguise onto the Internet lowers my expectations for companies to behave responsibly and transparently. It feels a bit like Amazon and their scheme to collect logistics data from competitors by operating under a different name. In that case, like this one, the facade was paper thin... the headquarters of the fake company was right next to Amazon, but it worked for a long while.

Since I think 1 is the most likely, I believe OpenAI wants to make sure it soundly beats everyone else in the rankings before releasing an update with improvements. But didn't they just release an update a few weeks ago? Hmm.

Popular Comments

Recent Discussion

Firstly, I'm assuming that a high-resolution human brain emulation that you can run on a computer is conscious in the normal sense that we use in conversations. Like, it talks, has memories, makes new memories, has friends and hobbies and likes and dislikes and stuff. Just like a human that you could talk with only through a videoconference-type thing on a computer, but without an actual meaty human on the other end. It would be VERY weird if this emulation exhibited all these human qualities for some reason other than the reason meaty humans exhibit them. Like, very extremely what-the-fuck surprising. Do you agree?

So, we now have a deterministic human file on our hands.

Then, you can trivially make a transformer-like next-token predictor out of the human emulation. You just have the emulation,...

Humans come to reflect on their thoughts on their own without being prompted into it (at least I have heard some anecdotal evidence for it, and I also discovered this myself as a kid). The test would be whether LLMs would come up with such insights without being trained on text describing the phenomenon. It would presumably involve some way to observe your own thoughts (or some similar representation). The existing context window seems to be too small for that.

the gears to ascension 2h
I asked claude-3-opus at temperature 1 to respond to this, so that people who don't talk to claude can get a sense of claude's unusual-for-today's-AIs response to this topic. The temperature 1 is due to increased eloquence at temp 1. me: Claude-3-opus-temp-1:
weightt an 13m
Good point, Claude, yeah. Quite alien indeed, maybe more parsimonious. This is exactly what I meant by the possibility of this analogy being overridden by actually digging into your brain, digging into a human one, developing actually technical gears-level models of both, and then comparing them. Until then, who knows; I'm leaning toward a healthy dose of uncertainty. Also, thanks for the comment.
ryan_greenblatt 6h
Rob Long works on these topics.

Oh great, thanks!

This is a linkpost for https://dynomight.net/seed-oil/

A friend has spent the last three years hounding me about seed oils. Every time I thought I was safe, he’d wait a couple months and renew his attack:

“When are you going to write about seed oils?”

“Did you know that seed oils are why there’s so much {obesity, heart disease, diabetes, inflammation, cancer, dementia}?”

“Why did you write about {meth, the death penalty, consciousness, nukes, ethylene, abortion, AI, aliens, colonoscopies, Tunnel Man, Bourdieu, Assange} when you could have written about seed oils?”

“Isn’t it time to quit your silly navel-gazing and use your weird obsessive personality to make a dent in the world—by writing about seed oils?”

He’d often send screenshots of people reminding each other that Corn Oil is Murder and that it’s critical that we overturn our lives...

I would dissuade no one from writing drunk, and I'm confident that you too can say that people are penguins! But I'm sorry to report that personally I don't do it by drinking, but rather by writing a much longer version with all those kinds of clarifications included and then obsessively editing it down.

sapphire 9h
Strong upvoted. I learned a lot. Seriously interested in what you think is relatively safe and not extremely expensive or difficult to acquire. Some candidates I thought of, though I'm not exactly well informed:
-- Grass-fed beef
-- oysters/mussels
-- some whole grains? which?
-- fruit
-- vegetables you somehow know aren't contaminated by anti-pest chemicals?
I really need some guidance here.
JenniferRM 13h
This bit caught my eye: I searched for [is olive oil cut with canola oil] and found that in the twenty teens organized crime was flooding the market with fake olive oil, but in 2022 an EU report suggested that uplabeling to "extra virgin" was the main problem they caught (still?). Coming from the other direction, in terms of a "solid safe cheap supply"... I can find reports of Extra Virgin Olive Oil being sold by Costco under their Kirkland brand that is particularly well sourced and tested, and my priors say that this stuff is likely to be weirdly high quality for a weirdly low price (because, in general, "kirklandization" is a thing that food producers with a solid product and huge margins worry about). I'm kinda curious if you have access to Kirkland EVOO and if it gives you "preflux"? Really any extra data here (where your sensitive palate gives insight into the current structure of the food economy) would be fascinating :-)
Ann 16h
Thanks for the reference! I'm definitely confused about the inclusion of "pre-prepared (packaged) meat, fish and vegetables" on the last list, though. Does cooking meat or vegetables before freezing it (rather than after? I presume most people aren't eating meat raw) actually change its processed status significantly?
Thomas Kwa 8h
Hangnails are Largely Optional

Hangnails are annoying and painful, and most people deal with them poorly. [1] Instead, use a drop of superglue to glue the hangnail to your nail plate. It's $10 for 12 small tubes on Amazon. Superglue is also useful for cuts and minor repairs, so I already carry it around everywhere.

Hangnails manifest as either separated nail fragments or dry peeling skin on the paronychium (area around the nail). In my experience superglue works for nail separation, and a paper (available free on Scihub) claims it also works for peeling skin on the paronychium.

Is this safe? Cyanoacrylate glue is regularly used in medicine to close wounds, and now frequently replaces stitches. Medical superglue has slightly different types of cyanoacrylate, but doctors I know say it's basically the same thing. I think medical superglue exists to prevent rare reactions and for large wounds where the exothermic reaction from a large quantity might burn you, and the safety difference for hangnails is minimal [2]. But to be extra safe you could just use 3M medical grade superglue or Dermabond.

[1]: Typical responses to hangnails include:
* Pulling them out, which can lead to further bleeding or infection.
* Trimming them with nail clippers, which often leaves a jagged edge.
* Wrapping the affected finger in a bandage, requiring daily changes.

[2]: There have been studies showing cytotoxicity in rabbits when injecting it in their eyes, or performing internal (bone or cartilage) grafts. A 2013 review says that although some studies have found internal toxicity, "[f]or wound closure and various other procedures, there have been a considerable number of studies finding histologic equivalence between ECA [commercial superglue] and more widely accepted modalities of repair."
nim 1h

If you don't need 12 tubes of superglue, dollar stores often carry 4 tiny tubes for a buck or so.

I'm glad that superglue is working for you! I personally find that a combination of sharp nail clippers used at the first sign of a hangnail, and keeping my hands moisturized, works for me. Flush cutters of the sort you'd use to trim the sprues off of plastic models are also amazing for removing proto-hangnails without any jagged edge.

Another trick to avoiding hangnails is to prevent the cuticles from growing too long, by pushing them back regularly. I personal... (read more)

Post for a somewhat more general audience than the modal LessWrong reader, but gets at my actual thoughts on the topic.

In 2018 OpenAI defeated the world champions of Dota 2, a major esports game. This was hot on the heels of DeepMind’s AlphaGo performance against Lee Sedol in 2016, achieving superhuman Go performance way before anyone thought that might happen. AI benchmarks were being cleared at a pace which felt breathtaking at the time, papers were proudly published, and ML tools like Tensorflow (released in 2015) were coming online. To people already interested in AI, it was an exciting era. To everyone else, the world was unchanged.

Now Saturday Night Live sketches use sober discussions of AI risk as the backdrop for their actual jokes, there are hundreds...

quetzal_rainbow 7h
I feel like I am a victim of the transparency illusion. The first part of the OP's argument is "LLMs need data, data is limited, and synthetic data is meh". A direct counterargument to this is "here is how to avoid the drawbacks of synthetic data". The second part of the OP's argument is "LLMs are humanlike and will remain so", and a direct counterargument is "here is how to make LLMs more capable but less humanlike; it will be adopted because it makes LLMs more capable". Walking around telling everyone ideas for how to make AI more capable and less alignable is pretty much ill-advised.
Ape in the coat 7h
Thankfully, this is a class of problems that humanity has experience dealing with. The solution boils down to regulating all the ways to make LLMs less human-like out of existence.
quetzal_rainbow 6h
You mean, "ban superintelligence"? Because superintelligences are not human-like. That's the problem with your proposal of "ethics module". Let's suppose that we have system of "ethics module" and "nanotech design module". Nanotech design module outputs 3D-model of supramolecular unholy abomination. What exactly should ethics module do to ensure that this abomination doesn't kill everyone? Tell nanotech module "pls don't kill people"? You are going to have hard time translating this into nanotech designer internal language. Make ethics module sufficiently smart to analyse behavior of complex molecular structures in wide range of environments? You have now all problems with alignment of superintelligences.

You mean, "ban superintelligence"? Because superintelligences are not human-like.

The kind of superintelligence that doesn't possess the human-likeness that we want it to possess.

That's the problem with your proposal of an "ethics module". Let's suppose that we have a system consisting of an "ethics module" and a "nanotech design module". The nanotech design module outputs a 3D model of a supramolecular unholy abomination. What exactly should the ethics module do to ensure that this abomination doesn't kill everyone?

The nanotech design module has to be evaluable by the ethics module. For that it... (read more)

If it’s worth saying, but not worth its own post, here's a place to put it.

If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.

If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.

The Open Thread tag is here. The Open Thread sequence is here.

So, I have three very distinct ideas for projects that I'm thinking about applying to the Long Term Future Fund for. Does anyone happen to know if it's better to try to fit them all into one application, or split them into three separate applications?

niplav 8h
Thanks, that makes sense.

I was recently asked about my opinion on various schools of Political Philosophy (e.g. classical liberalism, neoliberalism, and Ayn Rand's Objectivism). I refused to engage with any of them in detail, because my position is that there is no room for different schools of "Political Philosophy". Ethics and Science (mainly Social Science) are enough to completely determine the best public action.

To develop this idea, I am going to divide the field of political science into three layers: i) Social Welfare definition: what is the ethical objective of political choice; ii) Policy Making: how Science (mainly Social Science) and Ethics combine to generate optimal policies; and iii) Institutional Design: which institutional mechanisms consistently generate the best flow of policies.

Although at the individual level there is a trade-off between our...

TAG 2h

You are conflating subjective as in "by subjects" with subjective as in "for subjects". A subject can have preferences for objectivity, universality, impartiality, etc.

ChristianKl 4h
Isn't the main argument that Zvi makes that China is willing to do AI regulation, and thus we can also do AI regulation? In that frame, the fact that Meta releases its weights is just a regulatory failure on our part.
MiguelDev 13h
Copy-pasting an entire paper/blog post and asking the model to summarize it? This isn't hard to do, and it's very easy to check whether there are enough tokens: just run the text through any BPE tokenizer available online.
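For example, a minimal sketch of that token count using the tiktoken library (an assumption on my part; any BPE tokenizer gives a similar estimate, and "paper.txt" is a hypothetical filename):

```python
# Rough token count for a pasted paper/blog post, to compare against a context window.
import tiktoken

text = open("paper.txt", encoding="utf-8").read()  # the article you want summarized
enc = tiktoken.encoding_for_model("gpt-4")          # BPE encoding used by GPT-4
n_tokens = len(enc.encode(text))
print(f"{n_tokens} tokens")
```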
gwern 2h
Sure, the poem prompt I mentioned using is like 3500 characters all on its own, and it had no issues repeatedly revising and printing out 4 new iterations of the poem without apparently forgetting when I used up my quota yesterday, so that convo must've been several thousand BPEs.

Yeah, I saw your other replies in another thread, and I was able to test it myself later today, and yup, it's most likely OpenAI's new LLM. I'm just still confused about why they'd call it gpt2.

In Clarifying the Agent-Like Structure Problem (2022), John Wentworth describes a hypothetical instance of what he calls a selection theorem. In Scott Garrabrant's words, the question is, does agent-like behavior imply agent-like architecture? That is, if we take some class of behaving things and apply a filter for agent-like behavior, do we end up selecting things with agent-like architecture (or structure)? Of course, this question is heavily under-specified. So another way to ask this is, under which conditions does agent-like behavior imply agent-like structure? And, do those conditions feel like they formally encapsulate a naturally occurring condition?

For the Q1 2024 cohort of AI Safety Camp, I was a Research Lead for a team of six people, where we worked a few hours a week to better understand...

There is a specific part of this problem that I'm very interested in, and that is looking at the boundaries of potential sub-agents. It feels like part of the goal here is to filter away potential "daemons" or inner optimisers, so it feels kind of important to think of ways one can do this?

I can see how this project would be valuable even without it but do you have any thoughts about how you can differentiate between different parts of a system that's acting like an agent to isolate the agentic part?

I otherwise find it a very interesting research direction.
