boazbarak

Comments
Call for suggestions - AI safety course
boazbarak · 1d

Thank you - although these types of questions (how can a weak agent verify the actions of a strong agent?) are closely related to my background in computational complexity (e.g., interactive proofs, probabilistically checkable proofs, delegation of computation), I plan to keep the course very empirical.
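(As a toy illustration of the delegation-of-computation flavor mentioned above - not material from the course, and all names here are hypothetical - a weak verifier can spot-check a small random sample of a strong agent's claimed results, so a cheating agent is caught with probability that grows with the number of checks.)

```python
import random

def spot_check(claims, recompute, num_checks=20, rng=random):
    """Weak verifier: re-check a random sample of the strong agent's claims.

    claims: dict mapping task -> claimed answer (produced by the strong agent)
    recompute: function task -> correct answer (cheap for the verifier to run
               on a few tasks, even if running it on all tasks is infeasible)
    """
    sampled = rng.sample(list(claims), k=min(num_checks, len(claims)))
    return all(claims[task] == recompute(task) for task in sampled)

# If the strong agent cheats on a fraction f of the tasks, each independent
# check misses the cheating with probability (1 - f), so k checks miss with
# probability (1 - f) ** k: e.g. f = 0.1, k = 20 gives roughly a 12% miss rate.
```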

Call for suggestions - AI safety course
boazbarak · 10d

Yes, there is a general question I want to talk about, which is the gap between training, evaluation, and deployment, and the reasons why models might:

1. Be able to tell which of these environments they are in

2. Act differently based on that

A case for courage, when speaking of AI danger
boazbarak · 11d

Thank you, Ben. I don’t think name-calling and comparisons are helpful to a constructive debate, which I am happy to have. Happy 4th!

A case for courage, when speaking of AI danger
boazbarak · 12d

I agree with you on the categorization of 1 and 2. I think there is a reason why Godwin’s law was created: once threads follow the controversy attractor in this direction, they tend to be unproductive.

A case for courage, when speaking of AI danger
boazbarak · 12d

I edited the original post to make the same point with less sarcasm.

I take risk from AI very seriously, which is precisely why I am working on alignment at OpenAI. I am also open to talking with people who have different opinions, which is why I try to follow this forum (and also preordered the book). But I do draw the line at people making Nazi comparisons.

FWIW, I think radicals often hurt the causes they espouse, whether it is animal rights, climate change, or Palestine. Even if after decades the radicals are perceived to have been on “the right side of history”, their impact was often negative and delayed the outcome: David Shor was famously cancelled for making this point in the context of the civil rights movement.

A case for courage, when speaking of AI danger
boazbarak · 12d

I am one of those people who are supposed to be stigmatized/deterred by this action. I doubt this tactic will be effective. This thread (including the disgusting comparison to Eichmann, who directed the killing of millions in the real world - not in some hypothetical future one) does not motivate me to interact with the people holding such positions. Given that much of my extended family was wiped out by the Holocaust, I find these Nazi comparisons abhorrent, and would not look forward to interacting with people making them, whether or not they decide to boycott me.

BTW, this is not some original tactic; PETA uses similar approaches for veganism. I don’t think they are very effective either.

To @So8res - I am surprised and disappointed that this Godwin’s law thread survived a moderation policy that is described as “Reign of Terror”.

Call for suggestions - AI safety course
boazbarak · 12d

I am getting some great links in response to my post on X: https://x.com/boazbaraktcs/status/1940780441092739351

The best simple argument for Pausing AI?
boazbarak · 14d

“Healthcare” is pretty broad - certainly some parts of it are safety critical and some are less so. I am not familiar with all the applications of language models in healthcare, but if you are using an LLM to improve efficiency in healthcare documentation, then I would not call it safety critical. If you are connecting an LLM to a robot performing surgery, then I would call it safety critical.

It’s also a question of whether the AI’s outputs are used without supervision. If doctors or patients ask a chatbot questions, I would not call it safety critical, since the AI is not autonomously making the decisions.

The best simple argument for Pausing AI?
boazbarak · 14d

I think "AI R&D" or "datacenter security" are a little too broad.

I can imagine cases where we could deploy even existing models as an extra layer for datacenter security (e.g. anomaly detection). As long as this is for adding security (not replacing humans), and we are not relying on 100% success of this model, then this can be a positive application, and certainly not one that should be "paused."
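(A minimal sketch of the "extra layer, not a replacement" idea - the names and threshold below are hypothetical, not any actual system: the anomaly scorer only flags events for human review, so existing controls and human decision-making stay in place, and a missed detection leaves security no worse than before.)

```python
# Hypothetical sketch: an anomaly scorer that adds scrutiny without replacing
# humans or existing controls. Names and the threshold are made up.

def route_access_event(event, anomaly_score, threshold=0.9):
    """anomaly_score: callable returning a score in [0, 1] for an access event."""
    if anomaly_score(event) >= threshold:
        return "flag_for_human_review"      # extra scrutiny; a human still decides
    return "apply_existing_controls_only"   # nothing is removed on a miss
```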

With AI R&D, again the question is how you deploy it: if you are using a model in containers, supervised by human employees, then that’s fine. If you are letting models autonomously carry out large-scale training runs with little to no supervision, that is a completely different matter.

At the moment, I think the right mental model is to think of current AI models as analogous to employees that have a certain skill profile (which we can measure via evals etc.) and that also, with some small probability, could do something completely crazy. With appropriate supervision, such employees could also be useful, but you would not fully trust them with sensitive infrastructure.

As I wrote in my essay, I think the difficult point would be if we get to the “alignment uncanny valley”: alignment is at a sufficiently good level (e.g., the probability of failure is small enough) that people are actually tempted to entrust models with such sensitive tasks, but we don’t have strong enough control of this probability to drive it arbitrarily close to zero, and so there are risks from edge cases.
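(A back-of-envelope illustration of why that valley is uncomfortable - my numbers, not the essay’s: a per-task failure probability that feels small still compounds once models are entrusted with many sensitive tasks.)

```python
# Assumed, illustrative numbers: a small per-task failure probability
# compounds over many entrusted sensitive tasks.
p_failure = 1e-4                                  # assumed per-task failure rate
n_tasks = 10_000                                  # assumed number of entrusted tasks
p_at_least_one = 1 - (1 - p_failure) ** n_tasks
print(f"{p_at_least_one:.1%}")                    # ~63.2% chance of at least one failure
```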

The best simple argument for Pausing AI?
boazbarak · 15d

I am much more optimistic about getting AIs to reliably follow instructions (see https://www.lesswrong.com/posts/faAX5Buxc7cdjkXQG/machines-of-faithful-obedience ).

But I agree that we should not deploy systems (whether AI or not) in safety critical domains without extensive testing.

I don’t think that’s a very controversial opinion. In fact, I’m not sure “pause” is the right term, since I don’t think such deployment has started.

Posts

51 · Call for suggestions - AI safety course · 13d · 20 comments
33 · Machines of Faithful Obedience · 21d · 19 comments
91 · Six Thoughts on AI Safety · 6mo · 55 comments
51 · Reflections on "Making the Atomic Bomb" · 2y · 7 comments
33 · The shape of AGI: Cartoons and back of envelope · 2y · 19 comments
38 · Metaphors for AI, and why I don’t like them · 2y · 18 comments
38 · Why I am not a longtermist (May 2022) · 2y · 19 comments
42 · The (local) unit of intelligence is FLOPs · 2y · 7 comments
49 · GPT as an “Intelligence Forklift.” · 2y · 27 comments
134Ω · AI will change the world, but won’t take it over by playing “3-dimensional chess”. · 3y · 97 comments