aphyer

I am Andrew Hyer, currently living in New Jersey and working in New York (in the finance industry).

Comments

I actually edited to include your PVE change: you did manage a 64% winrate.  Sorry not to give you more time; I didn't realize there was work still ongoing.

Sorry, wasn't expecting anything today!  I'll update the wrapup doc to reflect your PVE answer: sadly, even if you had an updated PVP answer, I won't let you change that now :P

Sure, no objections.  In the absence of further requests I'll aim to post the wrapup doc Friday the 9th: I'm fairly busy midweek and might not get around to posting it sooner.

Very minor gripe: '22m' parses to me as '22 years old and male', which was briefly confusing. Maybe '22mo' would be clearer?

For example, here’s a Nash equilibrium: “Everyone agrees to put 99 each round. Whenever someone deviates from 99 (for example to put 30), punish them by putting 100 for the rest of eternity.” 


I don't think this is actually a Nash equilibrium?  It is dominated by the strategy  "put 99 every round.  Whenever someone deviates from 99, put 30 for the rest of eternity."

I believe the original post solved this by instead making the equilibrium "Everyone agrees to put 99 each round. Whenever someone deviates from 99 (for example to put 30), punish them by putting 100 for the next 2 rounds." I think that is a Nash equilibrium, because the punishment being finite means you're incentivized to stick with the algorithm even after a punishment occurs.
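The credibility point can be checked numerically with a toy model. Every number in this sketch (the discount factor, the temptation payoff of 130, the 0-vs-30 payoffs during punishment) is an invented assumption for illustration, not the original game's payoffs; the structure is just the standard discounted-repeated-game comparison:

```python
# Toy check of the two punishment schemes above. Assumed stage payoffs
# (pure inventions, NOT from the original game):
#   everyone plays 99             -> 99 each
#   you play 30 while others 99   -> 130 for you (the one-shot temptation)
#   others play 100 (punishing)   -> 0 if you also play 100, 30 if you play 30
DELTA = 0.9   # assumed discount factor
COOP, TEMPT, PUNISH_COMPLY, PUNISH_SHIRK = 99, 130, 0, 30

def discounted(payoffs, delta=DELTA):
    """Present value of a finite stream of per-round payoffs."""
    return sum(p * delta**t for t, p in enumerate(payoffs))

V_COOP = COOP / (1 - DELTA)   # value of everyone playing 99 forever

# Grim trigger ("100 for the rest of eternity"): once punishment starts,
# playing 100 forever earns you 0 per round, while playing 30 forever
# earns 30 per round -- carrying out the infinite punishment is dominated.
grim_comply = 0.0
grim_shirk = PUNISH_SHIRK / (1 - DELTA)
assert grim_shirk > grim_comply

# Finite (2-round) punishment: comply and you're back on the 99-path
# afterwards; shirk once and (assumed rule) the 2-round clock restarts.
def punish_value(rounds_left):
    if rounds_left == 0:
        return V_COOP
    return PUNISH_COMPLY + DELTA * punish_value(rounds_left - 1)

comply = punish_value(2)                        # 0, 0, then back to the path
shirk = PUNISH_SHIRK + DELTA * punish_value(2)  # 30 now, clock restarts
assert comply > shirk   # punishing is now in your own interest

# The one-shot deviation from the 99-path is still unprofitable.  After
# the 3-round window both streams rejoin the 99-path, so comparing the
# truncated streams is enough:
on_path = discounted([COOP, COOP, COOP])
deviate = discounted([TEMPT, PUNISH_COMPLY, PUNISH_COMPLY])
assert on_path > deviate

print(f"grim: comply={grim_comply:.1f} < shirk={grim_shirk:.1f}")
print(f"2-round: comply={comply:.1f} > shirk={shirk:.1f}")
```

Under these toy numbers the infinite punishment is dominated exactly as described, while the 2-round punishment passes both the on-path and the in-punishment deviation checks.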

Apologies, I was a bit blunt here.

It seems to me that the most obvious reading of "the burden of proof is on developers to show beyond-a-reasonable-doubt that models are safe" is in fact "all AI development is banned".  It's...not clear at all to me what a proof of a model being safe would even look like, and based on everything I've heard about AI Alignment (admittedly mostly from elsewhere on this site) it seems that no-one else knows either. 

A policy of 'developers should have to prove that their models are safe' would make sense in a world where we had a clear understanding that some types of model were safe, and wanted to make developers show that they were doing the safe thing and not the unsafe thing.  Right now, to the best of my understanding, we have no idea what is safe and what isn't.

If you have some idea of what a 'proof of safety' would look like under your system, could you say more about that?  Are there any existing AI systems you think can satisfy this requirement?  

From my perspective the most obvious outcomes of a burden-of-proof policy like you describe seem to be:

  • If it is interpreted literally and enforced as written, it will in fact be a full ban on AI development.  Actually proving an AI system to be safe is not something we can currently do.
  • Many possible implementations of it would not in fact ban AI development, but it's not clear that what they would do would actually relate to safety.  For instance, I can easily imagine outcomes like:
    • AI developers are required to submit a six-thousand-page 'proof' of safety to the satisfaction of some government bureau.  This would work out to something along the lines of 'only large companies with compliance departments can develop AI', which might be beneficial under some sets of assumptions that I do not particularly share?
    • AI developers are required to prove some narrow thing about their AI (e.g. that their AI will never output a racial slur under any circumstances whatsoever).  While again this might be beneficial under some sets of assumptions, it's not clear that it would in fact have much relationship to AI safety.

There's a model of white-collar employment I think is missing here.  (I also thought it was missing at some points in the Moral Mazes sequence, but never got around to writing it down then).

The model is underutilized employees as option value.

Imagine yourself as a manager, running a small team at some company somewhere.

Most weeks, your team has 40 hours of work to do.

Every few months, there is a crisis.  Perhaps your firm's product is scheduled to release in 2 weeks when a regulator suddenly dumps 500 pages of compliance questions on you and refuses to approve your product until they are answered. Perhaps a security flaw is discovered in a major library you use and you need to refactor your whole codebase.  Perhaps someone makes a mistake and your firm's biggest client is ticked off.  In any case, you have 200 hours of work to do that week, and if it is not done that will be Very Bad.

How many employees should you hire?

You could hire one employee.  This employee would be busy but not unmanageably so most weeks...and then as soon as something went wrong, there would be a disaster.

So instead you hire perhaps four employees.  Most weeks they each have 10 hours of work to do, and spend the rest of the time chatting around the water cooler or playing solitaire on work computers or whatever.  And when a fire drill happens, you get them to drop the solitaire games and maybe put in a few extra hours and you have enough people to handle it.

The thing you are buying with these three extra employees is not the 10 hours they each work in a typical week.  It is them being around and available and familiar with the job when something goes wrong.
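The arithmetic of the model can be sketched in a few lines. The 40-hour normal week and 200-hour crisis week are from the model above; the 50-hour surge capacity per employee (40 normal hours plus some overtime) is my assumption for illustration:

```python
# Toy version of the staffing-as-option-value model.
NORMAL_LOAD = 40     # hours of work in a typical week (from the model)
CRISIS_LOAD = 200    # hours of work in a crisis week (from the model)
SURGE_CAPACITY = 50  # assumed max hours one employee can put in during a crisis

def crisis_covered(headcount, capacity=SURGE_CAPACITY):
    """Can the team absorb a crisis week?"""
    return headcount * capacity >= CRISIS_LOAD

def typical_hours_each(headcount):
    """Hours of real work per employee in a normal week."""
    return NORMAL_LOAD / headcount

for n in (1, 4):
    print(n, typical_hours_each(n), crisis_covered(n))
# -> 1 40.0 False   (busy every week, but a crisis is a disaster)
# -> 4 10.0 True    (lots of visible slack, but the crisis is covered)
```

The "wasted" 30 hours per employee per normal week is exactly the slack that makes `crisis_covered(4)` come out true.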

-------END OF MODEL, BEGINNING OF ARGUMENT------

Many people seem to me to be operating on a model something like this:


"There is X hours of work a week to do in your job.  If you do those X hours of work, you are Doing Your Job.  Employers want you to be e.g. in the office 9-to-5, even if that is not needed to Do Your Job, because they are evil monsters who enjoy the taste of your tears."

Under that model, if you can do your job in 10 hours a week with GPT that is great!  This high productivity should lead to some improvements, either in you having more free time, or in you being able to get more jobs and make more money.

Under my model, of course you can do your job in 10 hours most weeks.  This has nothing to do with GPT!  Your job can be done in 10 hours most weeks because most weeks nothing is on fire.

But if you have two employees:

  • Alice shows up at the office, does 10 hours of work, and plays solitaire for 30 hours.
  • Bob works from home, does 10 hours of work, and then switches to Job #2.

Bob is actually worth much less as an employee.  Because when something goes wrong, you can grab Alice, exercise your option, get her to stop playing solitaire, and have her do a bunch of work that was needed...but when you try to exercise your option on Bob, he'll be on a call with his manager from Job #3.  Your actual binding constraint is how much work needs to get done during a crisis, and Bob contributes very little to that.
