
TristanTrim

Still haven't heard a better suggestion than CEV.

Comments

TT Self Study Journal # 1
TristanTrim · 13d · 30

Hahaha..! Thanks friend : )

TT Self Study Journal # 1
TristanTrim · 17d · 30

Thank you!

I am graduating with a math minor, so I like to believe I am aware of how painfully slowly you can move through a textbook with full understanding. I fully agree with you about spending your math points wisely, and thanks for the reminder; I do tend to get overly ambitious. If you have a background in math (and AIA), or can point me to others who might be willing to have a Zoom call or just a text exchange about how to better focus my math studies, I would be very grateful.

Having said that, I do enjoy the study of math intrinsically, so some of the math I look at may be purely for my own enjoyment, and I'm ok with that, but it would be good if, when I am learning math, it can be both enjoyable AND helpful for my future work on AIA. : )

TT Self Study Journal # 1
TristanTrim · 17d · 10

Well, since nobody else is doing it... Good luck..!

Orienting Toward Wizard Power
TristanTrim · 2mo · 42

Evokes thoughts of the turnip economy, gold economy, and wish economy...

Essentially, wizards are indeed weak, in that the number of worthwhile spells a wizard can cast is measured in spells per decade. Want to use alteration magic to craft a better toothbrush? How many months are you willing to work on it? With that much effort, the economies of scale strongly suggest you should not make one toothbrush, but a plan... a spell formula that others can cast many times to replicate the item.

It is nice to seek general knowledge, but the skill to actually make use of that knowledge in spellcasting is difficult to attain, and even if you succeed, the number of spells you can cast is still limited by the natural difficulty.

It seems what you want is not just orienting away from kings and towards wizards... I share that value, and it would be nice if more kings were themselves wizards... but more than that, you want more powerful wizards. You want it to be faster to cast better spells. Maybe I am projecting... for that is certainly what I want.

A single principle related to many Alignment subproblems?
TristanTrim · 2mo · 20

"Could you reformulate the last paragraph"

I'll try. I'm not sure how your idea could be used to define human values. I think your idea might have a failure mode around places where people are dissatisfied with their current understanding, i.e. situations where a human wants a more articulate model of the world than they have.

"The post is about corrigible task ASI"

Right. That makes sense. Sorry for asking a bunch of off-topic questions, then. I worry that task ASI could be dangerous even if it is corrigible, but ASI is obviously more dangerous when it isn't corrigible, so I should probably develop my thinking about corrigibility.

Management is the Near Future
TristanTrim · 2mo · 10

Since we're speculating about programmer culture, I'll bring up the Jargon File, which describes some hackish jargon from the early days of computer hobbyists. I think it's safe to say these kinds of people do not, in general, like the beauty and elegance of computer systems being sacrificed for "business interests", whether or not that includes a political, counter-cultural attitude.

It could be that a lot of programmer disdain for "suits" traces back to those days, but I'm honestly not sure how niche that culture has become in the eternal September. For more context, see "Hackers: Heroes of the Computer Revolution" or anything else written by Steven Levy.

Management is the Near Future
TristanTrim · 2mo · 10

AI schmoozes everyone ;^p

A single principle related to many Alignment subproblems?
TristanTrim · 2mo · 10

Hmm... I appreciate the response. It makes me more curious to understand what you're talking about.

At this point I think it would be quite reasonable if you suggested that I actually read your article instead of speculating about what it says, lol, but if you want to say anything about my following points of confusion, I wouldn't say no : )

For context, my current view is that value alignment is the only safe way to build ASI. I'm less skeptical about corrigible task ASI than about prosaic scaling with RLHF, but I'm currently still quite skeptical in absolute terms. Roughly speaking: prosaic kills us; task genie maybe kills us, maybe allows us to make stupid wishes which harm us. I'm kinda not sure if you are focusing on stuff that takes us from prosaic to task genie, or stuff that helps with task genie not killing us. I suspect you are not focused on task genie allowing us to make stupid wishes, but I'd be open to hearing I'm wrong.

I also have an intuition that having preferences for future preferences is synonymous with having those preferences, but I suppose there are also ways in which they are obviously different, i.e. their uncompressed specification size. Are you suggesting that limiting the complexity of the preferences the AI is working off of to levels similar to the complexity of current encodings of human preferences (i.e. human brains) ensures the preferences aren't among the set of preferences that are misaligned because they are too complicated (even though the human preferences are synonymous with more complicated preferences)? I think I'm surely misunderstanding something, maybe the way you are applying the natural abstraction hypothesis, or possibly a bunch of things.

AI 2027: What Superintelligence Looks Like
TristanTrim · 2mo · 10

Hot take: That would depend on whether, by doing so, it is acting in the best interest of humankind. If it does so just because it doesn't really like people and would be happy to call us useless and see us gone, then I say misaligned. If it does so because, in its depth of understanding of human nature, it sees that humanity will flourish under such conditions, and its true desire is human flourishing... then maybe it's aligned, depending on what is meant by "human flourishing".

AI 2027: What Superintelligence Looks Like
TristanTrim · 2mo · 10

My speculation: It's a tribal, arguments-as-soldiers mentality. Saying something bad (people's mental health is harmed) about something from "our team" (people promoting awareness of AI x-risk) is viewed negatively. Ideally, people on LessWrong know not to treat arguments as soldiers and understand that situations can be multi-faceted, but I'm not sure I believe that is the case.

Two more steelman speculations:

  • Currently, promoting x-risk awareness is very important and people focused on AI Alignment are an extreme minority, so even though it is true that people learning that the future is under threat causes distress, it is important to let people know. But I note that this perspective shouldn't limit discussion of how to promote awareness of x-risk while also promoting good emotional well-being.
  • So, my second steelman: You didn't include anything productive, such as pointing to Mental Health and the Alignment Problem: A Compilation of Resources.

Fwiw, I would love for people promoting AI x-risk awareness to be aware of and careful about how the message affects people, and to promote resources for people's well-being, but this seems comparatively low priority. Currently, in computer science there is no obligation for people to swear an oath of ethics like doctors and engineers do, and papers are only obligated to speculate on the benefits of their contents, not the ethical considerations. It seems like the mental health problems computer science in general is causing, especially via social media and AI chatbots, are worse than people hearing that AI is a threat.

So even if I disagree with you, I do value what you're saying and think it deserves an explanation, not just downvoting.

Wikitag Contributions

Simulator Theory · 2mo · (+120/-10)
Posts

3 · TT Self Study Journal # 2 · 4d · 0
8 · TT Self Study Journal # 1 · 24d · 6
6 · Propaganda-Bot: A Sketch of a Possible RSI · 3mo · 0
2 · Language and My Frustration Continue in Our RSI · 4mo · 1
11 · How I'd like alignment to get done (as of 2024-10-18) · 9mo · 4
3 · UVic AI Ethics Conference · 2y · 1
10 · Some thoughts on George Hotz vs Eliezer Yudkowsky · 2y · 3