Charbel-Raphaël Segerie

https://crsegerie.github.io/ 

Living in Paris

[We don't think this long-term vision is a core part of constructability, which is why we didn't put it in the main post.]

We asked ourselves what we should do if constructability works in the long run.

We are unsure, but here are several possibilities.

Constructability could lead to different outcomes depending on how well it works, from most to least ambitious:

  1. Using GPT-6 to implement GPT-7-white-box (foom?)
  2. Using GPT-6 to implement GPT-6-white-box
  3. Using GPT-6 to implement GPT-4-white-box
  4. Using GPT-6 to implement Alexa++, a humanoid housekeeper robot that cannot learn
  5. Using GPT-6 to implement AlexNet-white-box
  6. Using GPT-6 to implement a transparent expert system that filters CVs without using protected features (a sketch of such a filter follows this list)
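To make the least ambitious level (item 6) concrete, here is a minimal sketch of what a transparent, plain-coded CV filter could look like. The rules and thresholds are hypothetical illustrations, not part of the original proposal.

```python
from dataclasses import dataclass

@dataclass
class CV:
    years_of_experience: float
    has_required_degree: bool
    skills: set[str]
    # Protected features (age, gender, ...) are deliberately absent from the
    # data structure, so no rule can ever read them.

REQUIRED_SKILLS = {"python", "statistics"}  # hypothetical job requirements

def filter_cv(cv: CV) -> tuple[bool, list[str]]:
    """Return (accepted, reasons): every rejection names the exact rule that fired."""
    reasons = []
    if cv.years_of_experience < 2:
        reasons.append("less than 2 years of experience")
    if not cv.has_required_degree:
        reasons.append("missing required degree")
    missing = REQUIRED_SKILLS - cv.skills
    if missing:
        reasons.append(f"missing skills: {sorted(missing)}")
    return (len(reasons) == 0, reasons)

print(filter_cv(CV(3.0, True, {"python", "statistics", "sql"})))  # (True, [])
```

Because every rule is an explicit predicate and protected features are never passed to the system, an auditor can verify the "no protected features" property by inspection.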

Comprehensive AI Services path

We aim to reach the level of Alexa++, which would already be very useful: no more breaking your back picking up potatoes. Compared to the robot Figure01, which could kill you if your neighbor jailbreaks it, our robot seems safer: it would not have the capacity to kill, only to put the plates in the dishwasher, in the same way that today's Alexa cannot insult you.

Fully autonomous AGI, even if transparent, is too dangerous. We think that aiming for something like Comprehensive AI Services would be safer. Our plan would be part of this, allowing for the creation of many small, capable AIs that can be composed together (for instance, for a humanoid housekeeper, one function to do the dishes, one function to walk the dog, …).
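As a toy illustration (our sketch, not a design from the post) of what composing many small, narrowly scoped AIs could look like: each capability is a separate function, and a dispatcher can only call capabilities from a fixed whitelist, so the composed system never gains abilities that were not explicitly granted. The task names below (`do_dishes`, `walk_dog`) are hypothetical placeholders.

```python
# Toy sketch of a Comprehensive-AI-Services-style composition: narrow task modules
# behind a fixed whitelist. The household tasks are hypothetical placeholders.
from typing import Callable

def do_dishes() -> str:
    return "plates loaded into the dishwasher"

def walk_dog() -> str:
    return "dog walked around the block"

# The dispatcher can only invoke whitelisted capabilities; anything outside the
# whitelist is refused, regardless of how the request is phrased.
ALLOWED_TASKS: dict[str, Callable[[], str]] = {
    "do_dishes": do_dishes,
    "walk_dog": walk_dog,
}

def dispatch(task_name: str) -> str:
    if task_name not in ALLOWED_TASKS:
        return f"refused: '{task_name}' is not a granted capability"
    return ALLOWED_TASKS[task_name]()

print(dispatch("do_dishes"))   # plates loaded into the dishwasher
print(dispatch("harm_human"))  # refused: 'harm_human' is not a granted capability
```

The safety property here is structural rather than behavioral: the guarantee comes from the whitelist itself, not from training the system to refuse.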

Alexa++ is not an AGI, but it is already fine. It even knows how to do a backflip, Boston Dynamics style. Not enough for a pivotal act, but so stylish. We can probably have a nice world without AGI in the wild.

The Liberation path

Another possible moonshot theory of impact would be to replace GPT-7 with GPT-7-plain-code. Maybe there's a "liberation speed n" at which we can use GPT-n to directly code GPT-p with p>n. That would be super cool because this would free us from deep learning.

Different long-term paths that we see with constructability.

Guided meditation path

You are not really enlightened if you are not able to code yourself. 

Maybe we don't need to use something as powerful as GPT-7 to begin this journey.

We think that with significant human guidance, and by iterating many, many times, we could meander toward a progressive deconstruction of GPT-5.

We could use current models as a reference to create slightly more transparent and understandable models, then use those as references again and again until we arrive at a fully plain-coded model (a schematic of this loop is sketched after the list below).
  • Going from GPT-5 to GPT-2-hybrid seems possible to us.
  • Improving GPT-2-hybrid to GPT-3-hybrid may be possible with the help of GPT-5?
  • ...
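Here is the schematic loop referred to above, a minimal sketch that abstracts away the hard steps (producing a more legible candidate and checking it against the reference); `distill_to_transparent` and `matches_reference` are placeholders, not an existing API.

```python
# Schematic of the guided-meditation loop (our sketch, not an existing API):
# repeatedly use the current opaque model as a reference to construct a slightly
# more transparent candidate, check it against the reference, and iterate.

def distill_to_transparent(reference: dict) -> dict:
    """Placeholder for the hard, human-guided step: build a more legible model
    that imitates the reference."""
    return {"name": reference["name"], "transparency": reference["transparency"] + 1}

def matches_reference(candidate: dict, reference: dict) -> bool:
    """Placeholder for the behavioural check of the candidate against the reference."""
    return True  # in reality, this check is most of the work

def deconstruct(reference_model: dict, n_rounds: int) -> dict:
    current = reference_model
    for _ in range(n_rounds):
        candidate = distill_to_transparent(current)
        if matches_reference(candidate, current):
            current = candidate  # accept the more transparent model as the new reference
    return current  # ideally, after enough rounds, a fully plain-coded model

print(deconstruct({"name": "GPT-5", "transparency": 0}, n_rounds=3))
```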

If successful, this path could unlock the development of future AIs using constructability instead of deep learning. If constructability done right is more data-efficient than deep learning, it could simply replace deep learning and become the dominant paradigm. This would be a much better endgame position from which humans could control and develop future advanced AIs.

| Path | Feasibility | Safety |
| --- | --- | --- |
| Comprehensive AI Services | Very feasible | Very safe, but unstable in the very long run |
| Liberation | Feasible | Unsafe, but could enable a pivotal act that makes things stable in the long run |
| Guided Meditation | Very hard | Fairly safe, and could unlock a safer technology than deep learning, resulting in a better endgame position for humanity |

You might be interested in reading this. I think you are reasoning within an incorrect framing.

I have tried Camille's in-person workshop in the past and was very happy with it. I highly recommend it. It helped me discover many unknown unknowns.

Answer by Charbel-Raphaël, Apr 15, 2024

Deleted paragraph from the post that might answer the question:

Surprisingly, the same study found that even after an escalation of warning shots that ended up killing 100k people or causing >$10 billion in damage (definition), skeptics would only update their estimate from 0.10% to 0.25% [1]. There is a lot of inertia; we are not even sure this kind of "strong" warning shot would happen, and I suspect such a big warning shot could occur beyond the point of no return, because it requires autonomous replication and adaptation abilities in the wild.

  1. ^

    It may be because they expect a strong public reaction. But even if there were a 10-year global pause, what would happen after the pause? This explanation does not convince me. Did governments prepare for the next COVID?

In your case, you felt the problem, until you decided that an AI civilization might spontaneously develop a spurious concept of phenomenal consciousness.


This is currently the best summary of the post.

Thanks for jumping in! And I'm not really struggling with this emotionally; it was more of a nice puzzle, so don't worry about it :)

I agree my reasoning is not clean in the last chapter.

To me, the epiphany was that AI would rediscover everything, just as it rediscovered chess on its own. As I said in the box, this is a strong blow to non-materialist positions, and I did not emphasize this enough in the post.

I expect AI to be able to create "civilizations" (sort of) of its own in the future, with AI philosophers, etc.

Here is a snippet of my answer to Kaj, let me know what you think about it:

I'm quite confident that the meta-problem and the easy problems of consciousness will eventually be fully solved through advances in AI and neuroscience. I've written extensively about AI and the path to autonomous AGI here, and I would ask people: "Yo, what do you think AI is not able to do? Creativity? Ok do you know....". At the end of the day, I would aim to convince them that anything humans are able to do, we can reconstruct with AIs. I'd put my confidence level for this at around 95%. Once we reach that point, I agree that it will become increasingly difficult to argue that the hard problem of consciousness is still unresolved, even if part of my intuition remains somewhat perplexed. Maintaining a belief in epiphenomenalism while all the "easy" problems have been solved is a tough position to defend; I'm about 90% confident of this.

Thank you for clarifying your perspective. I understand you're saying that you expect the experiment to resolve to "yes" 70% of the time, making you 70% eliminativist and 30% uncertain. You can't fully update your beliefs based on the hypothetical outcome of the experiment because there are still unknowns.

For myself, I'm quite confident that the meta-problem and the easy problems of consciousness will eventually be fully solved through advances in AI and neuroscience. I've written extensively about AI and the path to autonomous AGI here, and I would ask people: "Yo, what do you think AI is not able to do? Creativity? Ok do you know....". At the end of the day, I would aim to convince them that anything humans are able to do, we can reconstruct with AIs. I'd put my confidence level for this at around 95%. Once we reach that point, I agree that it will become increasingly difficult to argue that the hard problem of consciousness is still unresolved, even if part of my intuition remains somewhat perplexed. Maintaining a belief in epiphenomenalism while all the "easy" problems have been solved is a tough position to defend; I'm about 90% confident of this.

So while I'm not a 100% committed eliminativist, I'm at around 90% (up from 40% in chapter 6 of the story). Yes, even after considering the ghost argument, there's still a small part of my thinking that leans towards Chalmers' view. However, the more progress we make in solving the easy and meta-problems through AI and neuroscience, the more untenable it seems to insist that the hard problem remains unaddressed.

a non-eliminativist might be perfectly willing to grant that yes, we can build the entire pyramid, while also holding that merely building the pyramid won't tell us anything about the hard problem nor the meta-problem.

I actually think a non-eliminativist would concede that building the whole pyramid does solve the meta-problem. That's the crucial aspect. If we can construct the entire pyramid, with the final piece being the ability to independently rediscover the hard problem in an experimental setup like the one I described in the post, then I believe even committed non-materialists would be at a loss and would need to substantially update their views.

Hmm, I don't understand something, but we are getting closer to the crux :)

 

You say:

  1. To the question, "Would you update if this experiment is conducted and is successful?" you answer, "Well, it's already my default assumption that something like this would happen". 
  2. To the question, "Is it possible at all?" you answer 70%.

So you answer 99-ish% to the first question and 70% to the second question, which seems incoherent.

It seems to me that you don't bite the bullet on the first question if you already expect this to happen. Saying, "Looks like I was right," looks like dodging the question.
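To spell out the incoherence I have in mind (my formalization, reading "default assumption" as roughly 99%): the event "the experiment is run and succeeds" entails "it is possible at all", so its probability cannot be higher.

```latex
% S = "the experiment is run and succeeds", P = "it is possible at all".
% S entails P, hence:
\Pr(S) \le \Pr(P), \qquad \text{yet } \Pr(S) \approx 0.99 > 0.70 \approx \Pr(P).
```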

That sounds like it would violate conservation of expected evidence:

Hmm, it seems there is something I don't understand; I don't think this violates the law.
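For reference, the standard statement of the law being invoked (not specific to this thread): the prior equals the expected posterior, so a hypothesis can only be confirmed by a successful experiment to the extent that it would be disconfirmed by a failed one.

```latex
% Conservation of expected evidence:
\Pr(H) = \Pr(H \mid E)\,\Pr(E) + \Pr(H \mid \neg E)\,\Pr(\neg E)
% so if Pr(H|E) > Pr(H) and Pr(E) < 1, then Pr(H|¬E) < Pr(H).
```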

 

I don't see how it does? It just suggests a possible approach by which the meta-problem could be solved in the future.

I agree I only gave a sketch of the proof; it seems to me that if you can build the pyramid, brick by brick, then this solves the meta-problem.

For example, when I give the example of the meta-cognition brick, I say that there is a paper that already implements this in an LLM (and I don't find this mysterious, because I know roughly how I would implement a database that behaves like this).

And it seems all the other bricks are "easily" implementable.

Let's put aside ethics for a minute.

"But it wouldn't be necessary the same as in a human brain."

Yes, it wouldn't be the same as in the human brain; it would be like the Swiss-cheese pyramid that I described in the post.

Your story ended on stating the meta-problem, so until it's actually solved, you can't explain everything.

Take a look at my answer to Kaj Sotala and tell me what you think.
