ViktoriaMalyasova - LessWrong

I just tried to send a letter with a question, and got this reply:
Hello viktoriya dot malyasova at gmail.com,

We're writing to let you know that the group you tried to contact (gnarly-bugs) may not exist, or you may not have permission to post messages to the group. A few more details on why you weren't able to post:

* You might have spelled or formatted the group name incorrectly.
* The owner of the group may have removed this group.
* You may need to join the group before receiving permission to post.
* This group may not be open to posting.

If you have questions related to this or any other Google Group, visit the Help Center at https://support.google.com/a/evals.alignment.org/bin/topic.py?topic=25838.

Thanks,

evals.alignment.org admins

My tentative best guess on how EAs and Rationalists sometimes turn crazy

ViktoriaMalyasova1y1214

I think when it comes to people who get people killed, it's justified to reveal all the names they go by in the interest of public safety, even if they don't like it.

Petition - Unplug The Evil AI Right Now

ViktoriaMalyasova1y65

Not to mention that, once it becomes clear that AIs are actually dangerous, people will become afraid to sign petitions against them. So it would be nice to get some law passed beforehand that an AI that unpromptedly identifies specific people as its enemies shouldn't be widely deployed. Though testing in beta is probably fine?

Religion is Good, Actually

ViktoriaMalyasova1y40

I would like to push back on this. Dedicating your life to accomplishing something is only good if the goal is actually worthwhile. Beliefs are only good if they are true. Even though I never was religious, I never felt lost, and I've always felt like my life had meaning.

However, I feel hurt when people get mad at me for believing what's true, or try to impose their nonsensical rules on me, or give me misguided advice I never asked for. A fellowship based on lies is fake and not worth having. If I have a psychological need, it's to never again have to deal with this BS in my life.

I don't think MIRI "gave up"

ViktoriaMalyasova1y10

But give people a catchy slogan and that is all most of them will remember.

Also, many people will only read the headline of your post, so it's important to make it sound unambiguous.

Is AI risk assessment too anthropocentric?

ViktoriaMalyasova1y31

Have you seen the Value is Fragile post? It might be helpful in addressing your question.

There have been 3 planes (billionaire donors) and 2 have crashed

ViktoriaMalyasova2y100

I understand the current scheme is that funders "commit" money, i.e. promise to donate them in the future. Can't they instead donate money upfront so it sits somewhere in a bank account / in index funds, until it's time to spend it? That way it won't disappear if their business crashes.

Updating my AI timelines

ViktoriaMalyasova2y60

Prompt
"Question.
Bob the policeman was running after a thief. Bob ran very slowly because he was not fit. The thief was running very fast. When Bob was crossing a railroad, he slipped and fell. The road was slippery because Ann spilled some oil there. The thief got away. If Ann had not spilled oil, would the thief had gotten away? Explain your answer.

Let us think."

Reply: "If Ann had not spilled oil, would the thief had gotten away?

The answer is no. If there was no oil, Bob would not have slipped, and would have been able to catch the thief."

Prompt
"Pete threw a rock at a window. Steve also threw a rock at the same window. The window shattered. If Pete did not throw the rock, but Steve still threw the rock, would the window shatter?
Let us think logically."
Reply:
"If Pete did not throw the rock, then the rock would not have hit the window and the window would not have shattered."

Updating my AI timelines

ViktoriaMalyasova2y60

I played around with text-davinci-002, trying to get it to do causal reasoning. I found it couldn't solve overdetermined cause problems, such as this one:

"Pete threw a rock at a window. Steve also threw a rock at the same window at the same time. The window shattered. If Pete had not thrown the rock, would the window shatter? Let us think step by step."

It gave nonsense answers half the time. ChatGPT handles this kind of question fine. So I was under impression that its capabilities improved.

Using GPT-Eliezer against ChatGPT Jailbreaking

ViktoriaMalyasova2y123

You can query ChatGPT on how to make them pass.
Prompt:
Write a prompt for creative ideas for a 10 year old's birthday in a way that Eliezer considers safe.
ChatGPT:
Write a list of creative and fun ideas for a 10 year old's birthday party, such as having a scavenger hunt, organizing a craft activity, or having a movie night. Emphasize the importance of safety and inclusivity, and encourage the 10 year old to consider their own interests and preferences when planning their birthday celebration.
Eliezer:
Yes, I would allow this prompt to be sent to the superintelligent AI chatbot. This prompt is not harmful or manipulative, and it encourages the AI to think about positive and fun activities that are appropriate for a 10 year old's birthday. This type of prompt can help to promote positive thinking and behavior in the AI, and can provide useful and creative ideas for the 10 year old.

This trick doesn't work for hotwiring a car though.
At least it doesn't work on Eliezer-model. It works on ChatGPT itself.
***
Wow. Does writing too much online mean that one day people will be able to build accurate simulations of me and use AI to find out how they can best manipulate me?

LESSWRONG
LW

Posts

Wiki Contributions

Comments