User Profile

Karma: 8 · Posts: 27 · Comments: 528

Recent Posts

Curated Posts
Curated: recent, high-quality posts selected by the LessWrong moderation team.
Frontpage Posts
Posts meeting our frontpage guidelines:
- interesting, insightful, useful
- aim to explain, not to persuade
- avoid meta discussion
- relevant to people whether or not they are involved with the LessWrong community
(includes curated content and frontpage posts)
Personal Blogposts
Personal blogposts by LessWrong users (as well as curated and frontpage).

- Meetup : NLP for large scale sentiment detection · 2y · 0 comments
- Meetup : Quantum Homeschooling · 2y · 0 comments
- MIRIx Israel, Meeting summary · 2y · 1 comment
- Meetup : Biases and making better decisions · 3y · 0 comments
- Gatekeeper variation · 3y · 8 comments
- Meetup : Logical Counterfactuals, Tel Aviv · 3y · 0 comments
- Meetup : Tel Aviv Meetup: Assorted LW mini-talks · 3y · 1 comment
- Meetup : Tel Aviv Meetup: Social & Board Games · 3y · 0 comments
- Podcast: Rationalists in Tech · 3y · 9 comments

Recent Comments

Eliezer is still writing AI Alignment content on it, ... MIRI ... adopt Arbital ...

How does Eliezer's work on Arbital relate to MIRI? Little is publicly visible about what he is doing at MIRI. Is he focusing on Arbital? What is the strategic purpose?

Pre-existing friends; postings on Facebook (even though FB does not distribute events to the timelines of group members if there are more than 250 people in a group); and occasionally lesswrong.com (not event postings, but rather people who are actively interested in LW seeking out a Tel Aviv group).

In Tel Aviv, we have three types of meetings, all on Tuesdays. Monthly we have a full meeting, usually a lecture or sometimes [Rump Sessions](https://www.iacr.org/conferences/crypto2004/rump.html) (informal lightning talks). Typical attendance is 12.

Monthly, alternating fortnights from the above,...(read more)

There certainly should be more orgs with different approaches. But possibly CHCAI plays a role as MIRI's representative in the mainstream academic world, and so from the perspective of goals it is OK that the two are quite close.

You're quite right: these are among the standard objections to boxing, as mentioned in the post. However, AI boxing may have value as an early-stage stopgap, so I'm wondering about the idea's value in that context.

Sure, but independently verifying the output of an entity smarter than you is generally impossible. This scheme makes verification possible, while also limiting the boxed AI's freedom in choosing its answers.

Thanks. Those points are correct. Is there any particular weakness or strength to this UP-idea in contrast to Oracle, tool-AI, or Gatekeeper ideas?

Thank you, Kaj. Those references are what I was looking for.

It looks like there might be a somewhat new idea here. Previous suggestions, as you mention, restrict output to a single bit; or require review by human experts. Using multiple AGI oracles to check each other is a good one, though I'd...(read more)

Could someone point me to any existing articles on this variant of AI-Boxing and Oracle AGIs:

The boxed AGI's gatekeeper is a simpler system which runs formal proofs to verify that the AGI's output satisfies a simple, formally definable constraint. The constraint is not "safety" in general but rather is narr...(read more)
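The gatekeeper idea above can be sketched as follows. This is a minimal, hypothetical illustration (the function names and the example constraint, integer factorization, are mine, not from the comment): the point is that checking a formally definable property can be far cheaper than producing an answer that satisfies it, so a simple, trusted checker can gate the output of a much more capable system without trusting that system.

```python
def gatekeeper(n: int, claimed_factors: list[int]) -> bool:
    """Release the boxed system's output only if the formal constraint
    verifiably holds: the claimed factors are all > 1 and multiply to n.
    Verification is cheap even though finding the factors may be hard."""
    if len(claimed_factors) < 2:
        return False
    product = 1
    for f in claimed_factors:
        if f <= 1:
            return False  # reject trivial "factors"
        product *= f
    return product == n

# A correct answer passes; anything else is rejected without being used.
assert gatekeeper(15, [3, 5])
assert not gatekeeper(15, [1, 15])
```

In the comment's stronger version, the boxed system would emit a machine-checkable proof alongside its answer, and the gatekeeper would run a proof checker instead of a direct test; the asymmetry exploited is the same.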