LESSWRONG

287
Alice Blair
73820630

Dumping out a lot of thoughts on LW in hopes that something sticks. Eternally upskilling.

I write the ML Safety Newsletter

DMs open, especially for promising opportunities in AI Safety and potential collaborators. I'm maybe interested in helping you optimize the communications of your new project.

  • Crocker's rules
  • Anonymous feedback/criticism/praise/etc
  • Personal Website

Comments
  • Alice Blair's Shortform (5 karma, 9mo, 6 comments)
Uncommon Utilitarianism #3: Bounded Utility Functions
Alice Blair6d10

An infinite utility function has some concrete input for which the output is "infinity", such as "you go to heaven in the Wager scenario". An unbounded utility function does not necessarily output "infinity" for any particular input. f(x)=x, or "count the number of paper clips", is unbounded, but at no concrete input does it tell you "infinity".
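
A minimal sketch of that distinction, for illustration only. The function names, the "heaven" outcome label, and the `cap` parameter are hypothetical choices made here, not anything from the original post; the bounded variant is included only as a contrast.

```python
import math


def paperclip_utility(n_paperclips: int) -> float:
    """Unbounded: grows without limit, yet is finite at every concrete input."""
    return float(n_paperclips)


def wager_utility(outcome: str) -> float:
    """Infinite: one concrete outcome is assigned the value infinity."""
    # "heaven" is a hypothetical label for the Wager scenario's payoff.
    return math.inf if outcome == "heaven" else 0.0


def bounded_utility(n_paperclips: int, cap: float = 100.0) -> float:
    """Bounded (hypothetical shape): increases with input but never exceeds cap."""
    return cap * (1.0 - math.exp(-n_paperclips / cap))


if __name__ == "__main__":
    # No single input to the unbounded function returns infinity.
    print(math.isfinite(paperclip_utility(10**100)))
    # The infinite function returns infinity at one specific input.
    print(math.isinf(wager_utility("heaven")))
    # The bounded function stays at or below its cap however large the input.
    print(bounded_utility(10**6) <= 100.0)
```

The unbounded function can be made as large as you like by choosing a larger input, but only the infinite one ever hands back "infinity" itself.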

Uncommon Utilitarianism #3: Bounded Utility Functions
Alice Blair7d20

I may indeed have made a mistake by frontloading the math and thought experiments and putting the introspection at the end, rather than centering the introspection and putting the rest in an appendix.

  1. That's not how utility works: utility is the unit of value, so it doesn't make sense in my ontology to say that they diminish in value.

  2. I don't think I'm anywhere near negative utilitarian enough to empathize with that last point. As I mention in my previous post, I'm quite positive utilitarian.

I don't really have time to digest 2&3 right now, and I find myself confused without reading up on the things you cite.

Uncommon Utilitarianism #3: Bounded Utility Functions
Alice Blair8d32

This seems like it works but demands a very strange universal prior that penalizes big things and large numbers. I consider the original Pascal's Mugging post to have settled the argument about this type of prior.

Uncommon Utilitarianism #3: Bounded Utility Functions
Alice Blair8d12

This is very far up, above my hopes for humanity in the good ASI worlds, but not wildly higher than that, I expect. This is not a practical post afaik, and I said so. It is for filling out our conception of utilitarianism, and adding robustness to edge cases can sometimes help with creating useful new frames. Historically, it is the idea that came to me first and inspired me to write the sublinear utility post.

Sublinear Utility in Population and other Uncommon Utilitarianism
Alice Blair8d20

Here is the post I mentioned which responds to the question of bounded utility functions in much more detail.

Alice Blair's Shortform
Alice Blair13d20

Update: I tried Claude 3 Sonnet, 3 Opus, 3.7 Sonnet, 4 Sonnet, and 4 Opus, and all of them can repeat back ' ForCanBeConvertedToForeach' just fine, so it's (probably) not just a straightforward porting of glitch tokens to the Claude models, which updates me a little towards pareidolia.

Alice Blair's Shortform
Alice Blair13d4-2

Connection I recently made:

  • 4o seems to really like spirals, specifically the "spiral emoji" (🌀), which is actually categorized as a cyclone. This also shows up in Claude's spiritual bliss attractor, if I recall correctly.
  • Remember glitch tokens? Someone claimed that they work with GPT-4 as well, since the originals were from GPT-3.5. When this came out, I picked one of them arbitrarily to try: ' ForCanBeConvertedToForeach'. GPT-3.5 consistently interpreted that token as "cyclone" and once "cyber" and GPT-4 had troubles, although not quite the same troubles:

I'm still not really sure what is going on with glitch tokens, and even though ' cyclone' isn't a glitch token itself, I suspect that there is something weird about it that maybe got crystallized in training. I'm not quite sure why this would show up in Claude, and maybe I'm just latching onto pareidolia.

Sublinear Utility in Population and other Uncommon Utilitarianism
Alice Blair13d20

The post on why my utility function is bounded is hopefully coming out later this week, and it is in fact an independent point from what this post is talking about. Neither of those muggings sounds like it would work. Alas, I don't have all my thoughts written out right here right now, so you shall have to wait.

Sublinear Utility in Population and other Uncommon Utilitarianism
Alice Blair21d30

Yeah, measure is pretty much what I was trying to get at in this post without actually getting into measure. I think a more detailed rewrite of this would maybe go into measure and more math, but that isn't my priority for now. I agree that you can want exactly some constant number c of beings; once again, I'm not trying to give the One Objective Morality here. I'm just talking about the shape that my values seem to have when I look at them, and maybe other people will find this useful.

Sublinear Utility in Population and other Uncommon Utilitarianism
Alice Blair21d10

I don't really understand what you're saying about the relativity point. Also, I'm not trying to say the "correct" way to value things is my way, I'm saying that my way is my way, and I don't think doubling up the transistors is going to do anything that it is coherent to care about.

Posts
  • AISN #65: Measuring Automation and Superintelligence Moratorium Letter (5 karma, 6d, 0 comments)
  • Uncommon Utilitarianism #3: Bounded Utility Functions (16 karma, 8d, 10 comments)
  • Uncommon Utilitarianism #2: Positive Utilitarianism (6 karma, 15d, 0 comments)
  • Sublinear Utility in Population and other Uncommon Utilitarianism (68 karma, 22d, 14 comments)
  • Alignment Faking Demo for Congressional Staffers (20 karma, 1mo, 2 comments)
  • Applied Murphyjitsu Meditation (21 karma, 1mo, 0 comments)
  • IABIED is on the NYT bestseller list (124 karma, 1mo, 5 comments)
  • Warmth, Light, Flame (39 karma, 1mo, 0 comments)
  • Being Handed Puzzles (14 karma, 2mo, 1 comment)
  • Handing People Puzzles (28 karma, 3mo, 2 comments)