Jakub Kraus

Running an AI safety group at the University of Michigan. https://maisi.club/

Email: jakraus@umich.edu

Anonymous feedback: https://www.admonymous.co/jakraus


Imagine you are the CEO of OpenAI, and your team has finished building a new, state-of-the-art AI model. You can:

  1. Test the limits of its power in a controlled environment.
  2. Deploy it without such testing.

Do you think (1) is riskier than (2)? I think the answer depends heavily on the details of the test.

> On the other hand, in your view all deep learning progress has been empirical, often via dumb hacks and intuitions (this isn't true imo).

Can you elaborate on why you think this is false? I'm curious.

On a related note, this part might be misleading:

> I’m just really, really skeptical that a bunch of abstract work on decision theory and similar [from MIRI and similar independent researchers] will get us there. My expectation is that alignment is an ML problem, and you can’t solve alignment utterly disconnected from actual ML systems.

I think earlier forms of this research focused on developing new, alignable algorithms, rather than aligning existing deep learning algorithms. However, a reader of the first quote might think "wow, those people actually thought galaxy-brained decision theory stuff was going to work on deep learning systems!"

For more details, see Paul Christiano's 2019 talk on "Current work in AI alignment":

> So for example, I might have a view like: we could either build AI by having systems which perform inference and models that we understand that have like interpretable beliefs about the world and then act on those beliefs, or I could build systems by having opaque black boxes and doing optimization over those black boxes. I might believe that the first kind of AI is easier to align, so one way that I could make the alignment tax smaller is just by advancing that kind of AI, which I expect to be easier to align.
>
> This is not a super uncommon view amongst academics. It also may be familiar here because I would say it describes MIRI's view; they sort of take the outlook that some kinds of AI just look hard to align. We want to build an understanding such that we can build the kind of AI that is easier to align.

> In my opinion it was the right call to spend this amount of funding on the office for the last ~6 months of its existence even when we thought we'd likely do something quite different afterwards.

This is confusing to me. Why not do "something quite different" from the start?

I'm trying to point at opportunity costs more than "gee, that's a lot of money, the outcome had better be good!" There are many other uses for that money besides the Lightcone offices.

> A smart, competent, charismatic person with horrible ethics will enter the office because they've managed to get good standing in the EA/longtermist ecosystem

My current understanding is that Sam gained good standing as a result of having lots of money for EA causes, not as a result of being charismatic in EA spaces. My sense is that the person you mentioned would struggle to gain good standing in the Lightcone offices without any preexisting money or other power.

I'm confused. Here's a conversation I just had with GPT-4, with prompts in italics:

*a, b, c, d, e, f, g. What is the third item in the list?*

The third item in the list is c.

*What is the third word of this sentence?*

To accurately identify the third word of the sentence, I need the sentence you are referring to. Please provide the sentence in question.

*The sentence is "What is the third word of this sentence?" See what I did there?*

Yes, I see what you did there. The third word of the sentence "What is the third word of this sentence?" is "the".

*What is the third letter of the third word of this sentence.*

The third word of the sentence "What is the third letter of the third word of this sentence." is "third". The third letter of the word "third" is i.

*Now, what is the second word of THIS sentence?*

The second word of the sentence "Now, what is the second word of THIS sentence?" is "what".

I pasted the YouTube video link into AssemblyAI's Playground (which I think uses Conformer-1 for speech to text) and generated a transcript, available at this link. However, the transcript lacks labels for who is speaking.

I asked GPT-4 to summarize the article and then come up with some alternative terms; here are a few I like:

  • One-way summary
  • Insider mnemonic
  • Contextual shorthand
  • Familiarity trigger
  • Conceptual hint
  • Clue for the familiar
  • Knowledge spark
  • Abbreviated insight
  • Expert's echo
  • Breadcrumb for the well-versed
  • Whisper of the well-acquainted
  • Insider's underexplained aphorism

I also asked for some idioms. "Seeing the forest but not the trees" seems apt.

> Brain computation speed is constrained by upper neuron firing rates of around 1 kHz and axon propagation velocity of up to 100 m/s [43], which are both about a million times slower than current computer clock rates of near 1 GHz and wire propagation velocity at roughly half the speed of light.

Can you provide some citations for these claims? At the moment the only citation is a link to a Wikipedia article about nerve conduction velocity.
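That said, the "about a million times" multiplier does follow from the figures quoted, even if the underlying figures themselves need citations. A quick arithmetic check, using only the numbers stated in the quote (not independent measurements):

```python
# Figures as quoted above; these are the claims needing citation,
# not independently verified values.
neuron_rate_hz = 1e3        # ~1 kHz upper neuron firing rate
clock_rate_hz = 1e9         # ~1 GHz computer clock rate
axon_speed_mps = 100.0      # ~100 m/s axon propagation velocity
wire_speed_mps = 3e8 / 2    # ~half the speed of light (3e8 m/s in vacuum)

# Ratio of clock rate to firing rate, and of wire speed to axon speed.
rate_ratio = clock_rate_hz / neuron_rate_hz      # 1e6
speed_ratio = wire_speed_mps / axon_speed_mps    # 1.5e6
print(rate_ratio, speed_ratio)
```

Both ratios come out around 10^6, so "about a million times slower" is at least internally consistent with the quoted numbers.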

> Transistors can fire about 10 million times faster than human brain cells

Does anyone have a citation for this claim?

The post title seems misleading to me. First, the outputs here seem pretty benign compared to some of the Bing Chat failures. Second, do all of these exploits work on GPT-4?
