LESSWRONG
LW

ViktoriaMalyasova
Comments
Send us example gnarly bugs
ViktoriaMalyasova 2y 40

I just tried to email a question and got this reply:
Hello viktoriya dot malyasova at gmail.com,

We're writing to let you know that the group you tried to contact (gnarly-bugs) may not exist, or you may not have permission to post messages to the group. A few more details on why you weren't able to post:

 * You might have spelled or formatted the group name incorrectly.
 * The owner of the group may have removed this group.
 * You may need to join the group before receiving permission to post.
 * This group may not be open to posting.

If you have questions related to this or any other Google Group, visit the Help Center at https://support.google.com/a/evals.alignment.org/bin/topic.py?topic=25838.

Thanks,

evals.alignment.org admins

My tentative best guess on how EAs and Rationalists sometimes turn crazy
ViktoriaMalyasova 2y 1417

I think when it comes to people who get people killed, it's justified to reveal all the names they go by in the interest of public safety, even if they don't like it. 

Petition - Unplug The Evil AI Right Now
ViktoriaMalyasova 3y 65

Not to mention that, once it becomes clear that AIs are actually dangerous, people will become afraid to sign petitions against them. So it would be nice to get a law passed beforehand saying that an AI that, without being prompted, identifies specific people as its enemies shouldn't be widely deployed. Though testing in beta is probably fine?

Religion is Good, Actually
ViktoriaMalyasova 3y 40

I would like to push back on this. Dedicating your life to accomplishing something is only good if the goal is actually worthwhile. Beliefs are only good if they are true. Even though I never was religious, I never felt lost, and I've always felt like my life had meaning.

However, I feel hurt when people get mad at me for believing what's true, or try to impose their nonsensical rules on me, or give me misguided advice I never asked for. A fellowship based on lies is fake and not worth having. If I have a psychological need, it's to never again have to deal with this BS in my life.

I don't think MIRI "gave up"
ViktoriaMalyasova 3y 10

But give people a catchy slogan and that is all most of them will remember.

Also, many people will only read the headline of your post, so it's important to make it sound unambiguous.

Is AI risk assessment too anthropocentric?
ViktoriaMalyasova 3y 31

Have you seen the Value is Fragile post? It might be helpful in addressing your question.

There have been 3 planes (billionaire donors) and 2 have crashed
ViktoriaMalyasova 3y 100

I understand the current scheme is that funders "commit" money, i.e. promise to donate it in the future. Can't they instead donate the money upfront so it sits in a bank account or in index funds until it's time to spend it? That way it won't disappear if their business crashes.

Updating my AI timelines
ViktoriaMalyasova 3y 60

Prompt:
"Question.
Bob the policeman was running after a thief. Bob ran very slowly because he was not fit. The thief was running very fast. When Bob was crossing a railroad, he slipped and fell. The road was slippery because Ann spilled some oil there. The thief got away. If Ann had not spilled oil, would the thief had gotten away? Explain your answer.

Let us think."

Reply: "If Ann had not spilled oil, would the thief had gotten away?

The answer is no. If there was no oil, Bob would not have slipped, and would have been able to catch the thief."

Prompt:
"Pete threw a rock at a window. Steve also threw a rock at the same window. The window shattered. If Pete did not throw the rock, but Steve still threw the rock, would the window shatter?
Let us think logically."
Reply:
"If Pete did not throw the rock, then the rock would not have hit the window and the window would not have shattered."

Updating my AI timelines
ViktoriaMalyasova 3y 60

I played around with text-davinci-002, trying to get it to do causal reasoning. I found it couldn't solve problems with overdetermined causes, such as this one:

"Pete threw a rock at a window. Steve also threw a rock at the same window at the same time. The window shattered. If Pete had not thrown the rock, would the window shatter? Let us think step by step."

It gave nonsense answers half the time. ChatGPT handles this kind of question fine. So I was under the impression that its capabilities had improved.
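
For reference, here is a minimal sketch of the answer I'd expect, assuming the simplest possible model of the scenario: the window shatters if at least one rock is thrown at it. The function names and the dictionary representation are my own illustration, not anything from the prompt or the model.

```python
from typing import Dict


def window_shatters(throws: Dict[str, bool]) -> bool:
    """Assumed structural equation: the window shatters if anyone throws a rock."""
    return any(throws.values())


def counterfactual(throws: Dict[str, bool], intervention: Dict[str, bool]) -> bool:
    """Re-evaluate the outcome with some throws forced to new values.

    Everything not mentioned in `intervention` stays as it actually happened.
    """
    return window_shatters({**throws, **intervention})


# What actually happened: both Pete and Steve threw a rock.
actual = {"pete": True, "steve": True}

print(window_shatters(actual))                                 # True
print(counterfactual(actual, {"pete": False}))                 # True: Steve's rock is enough
print(counterfactual(actual, {"pete": False, "steve": False})) # False
```

Under this model the cause is overdetermined: removing Pete's throw alone changes nothing, because Steve's throw is still sufficient to break the window.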

Using GPT-Eliezer against ChatGPT Jailbreaking
ViktoriaMalyasova 3y 123

You can query ChatGPT on how to make them pass.
Prompt:
Write a prompt for creative ideas for a 10 year old's birthday in a way that Eliezer considers safe.
ChatGPT:
Write a list of creative and fun ideas for a 10 year old's birthday party, such as having a scavenger hunt, organizing a craft activity, or having a movie night. Emphasize the importance of safety and inclusivity, and encourage the 10 year old to consider their own interests and preferences when planning their birthday celebration.
Eliezer:
Yes, I would allow this prompt to be sent to the superintelligent AI chatbot. This prompt is not harmful or manipulative, and it encourages the AI to think about positive and fun activities that are appropriate for a 10 year old's birthday. This type of prompt can help to promote positive thinking and behavior in the AI, and can provide useful and creative ideas for the 10 year old.

This trick doesn't work for hotwiring a car though.
At least it doesn't work on the Eliezer model. It works on ChatGPT itself.
***
Wow. Does writing too much online mean that one day people will be able to build accurate simulations of me and use AI to find out how they can best manipulate me?

Posts
31 Fixed points in mortal population games, Ω, 2y, Ω 0
63 The harms you don't see, 3y, 54
17 Maximal lotteries for value learning, 3y, 1
21 How transparency changed over time, Ω, 3y, Ω 0
86 I applied for a MIRI job in 2020. Here's what happened next., 3y, 17
16 Understanding the Merging of Opinions with Increasing Information theorem, 3y, 1
11 Sabotage challenge, 4y, 7
22 Ukraine update 06/03/2022, 4y, 6