Marius Adrian Nicoară

Comments

Will Jesus Christ return in an election year?
Marius Adrian Nicoară · 3mo

Thank you for taking the time to clarify things for me!

Will Jesus Christ return in an election year?
Marius Adrian Nicoară · 3mo

It's interesting to learn how the dollar amount placed on the bet was turned into percentage points. I was confused about that. Thanks for making it clear!

Just to make sure that I understand: does it make sense to describe the situation in terms of what I would call a hot potato dynamic? 

In the beginning, the "No" shares are a cold potato, because most people don't believe that Jesus will actually return.

But the incentive for people to buy "Yes" shares is their expectation that the people holding the "No" shares will sell their position later for money to place on other bets. 

When the "No" shares are sold, they are sold for less because those who sell them expect a better profit the mayoral race than from the prophesy of Jesus returning.

So the people who bought "Yes" shares in the beginning can make a profit for a while - the potato is hot. People might even buy "Yes" shares because they expect more people to sell "No" shares in the future.

But at some point people will stop selling "No" shares. And those holding the "Yes" shares won't be able to squeeze any more profit - the potato has now cooled, as it were.
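
To check my understanding with numbers, here is a toy sketch of that dynamic (my own simplification, not how Polymarket's order book actually works), assuming the "Yes" and "No" prices always sum to $1, so each net sale of "No" shares effectively bids the "Yes" price up:

```python
# Toy model of the "hot potato" dynamic described above. Assumption (mine,
# not from the thread): each net sale of "No" shares nudges the "Yes" price
# up by a fixed impact, since the two prices are complementary.

def simulate_yes_price(no_sales_per_round, impact=0.002, price_yes=0.03):
    """Track the "Yes" price as "No" holders sell to free up capital."""
    history = [price_yes]
    for sales in no_sales_per_round:
        # Selling "No" is equivalent to bidding "Yes" up; cap below $1.
        price_yes = min(0.99, price_yes + impact * sales)
        history.append(round(price_yes, 3))
    return history

# Heavy selling early on, then the selling dries up.
print(simulate_yes_price([40, 30, 20, 10, 0, 0]))
# [0.03, 0.11, 0.17, 0.21, 0.23, 0.23, 0.23]
# The price climbs while "No" shares are being sold, then flattens:
# the potato has cooled.
```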

Will Jesus Christ return in an election year?
Marius Adrian Nicoară · 3mo

If enough people want to sell their "No" shares, the "Yes" holders may be able to sell out at an elevated price

I'm not familiar with how Polymarket works, so I guess that's why I don't understand who would want to buy the shares of the "Yes" holders. Could you please explain?

Alignment Research Field Guide
Marius Adrian Nicoară · 4mo

It's not quite clear to me what a MIRIx chapter is.
The only thing I can connect it to is TEDx, which are events that are planned and coordinated independently, on a community-by-community basis, under a free license from TED.
Is a MIRIx chapter similar to a TEDx event?

Absolute Zero: Alpha Zero for LLM
Marius Adrian Nicoară · 5mo

"If the human wants coffee, we want the AI to get the human a coffee. We don't want the AI to get itself a coffee." 
It's not clear to me that this is the only possible outcome. It's not a mistake that we humans routinely make. In fact, there is some evidence that if someone asks us to do them a favor, we might end up liking them more and continue doing more favors for that person. Granted, there seem to have been no large-scale studies analyzing this so-called Ben Franklin effect. Even if this effect does turn out to be robust, it's not clear to me how it could transfer to an AI. And then there's the issue of making sure the AI won't somehow get rid of this constraint that we imposed on it.

"The problem is that we don't know what we want the AI to do, certainly not with enough precision to turn it into code." 
I agree; that's backed up by the findings from the Moral Machine experiment about what we think autonomous cars should do.
 

PSA: The LessWrong Feedback Service
Marius Adrian Nicoară · 5mo

It would be great to have automated feedback on the epistemics of a piece of text: an LLM that can read text and identify reasoning errors or add appropriate qualifiers. As a browser plugin, it would also be helpful when reading news articles. Perhaps it could be done using the Constitutional AI methodology, with Rationality: From A-Z (or something similar) as the constitution.
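
A minimal sketch of what that might look like, assuming a generic `complete(prompt) -> str` completion function (a placeholder, not any real API) and made-up stand-ins for principles distilled from Rationality: From A-Z:

```python
# Hypothetical epistemics checker; `complete` is a placeholder for whatever
# LLM completion function the plugin would wrap -- not a real API.

PRINCIPLES = [
    # Stand-ins for principles one might distill from Rationality: From A-Z.
    "Flag claims stated with more confidence than the cited evidence supports.",
    "Point out missing base rates or unquantified probability claims.",
    "Identify arguments aimed at a strawman of the opposing position.",
]

def epistemic_feedback(text: str, complete) -> str:
    """Build a Constitutional-AI-style critique prompt and return the model's review."""
    prompt = "Review the following text against each principle:\n"
    prompt += "\n".join(f"- {p}" for p in PRINCIPLES)
    prompt += f"\n\nText:\n{text}\n\nList any violations and suggest hedged rewordings."
    return complete(prompt)
```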

Absolute Zero: Alpha Zero for LLM
Marius Adrian Nicoară · 5mo

I only skimmed the post I'm linking to, but I'm curious whether the method of self-other overlap could help "keep AI meta-ethical evolution grounded to human preferences":

https://www.lesswrong.com/posts/jtqcsARGtmgogdcLT/reducing-llm-deception-at-scale-with-self-other-overlap-fine

My own high-level, vaguely defined guess at a method would be something so central to the functioning of the AI that if the AI goes against it, it can no longer make sense of the world. But that seems to carry the risk of the AI just messing everything up as it goes haywire. So the method should also include a way of limiting the AI's capabilities while it's in that confused state.
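
For what it's worth, here is my rough reading of the self-other-overlap idea as a loss-function sketch; the `hidden_states` helper and the prompt pairing are my own assumptions, not the authors' actual setup:

```python
import torch
import torch.nn.functional as F

def soo_loss(hidden_states, model, self_prompt: str, other_prompt: str) -> torch.Tensor:
    """Penalize the gap between activations on matched self- and other-referential prompts.

    `hidden_states(model, prompt)` is a hypothetical helper returning a chosen
    layer's activations; the fine-tuning setup in the linked post differs in detail.
    """
    h_self = hidden_states(model, self_prompt)    # e.g. "I will get the reward"
    h_other = hidden_states(model, other_prompt)  # e.g. "The user will get the reward"
    return F.mse_loss(h_self, h_other)

# The training objective would combine this with the usual task loss:
#   total_loss = task_loss + soo_weight * soo_loss(...)
```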

Absolute Zero: Alpha Zero for LLM
Marius Adrian Nicoară · 5mo

"it's the distinction between learning from human data versus learning from a reward signal." That's an interesting distinction. The difference I currently see between the two is that currently a reward signal can be hacked by the AI, while human data cannot. Is that an accurate thing to say? 

Are there any resources you could recommend for alignment methods that take into account the distinction you mentioned?

Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Marius Adrian Nicoară · 5mo

I think editing should be possible. Not sure about deleting it entirely.

Absolute Zero: Alpha Zero for LLM
Marius Adrian Nicoară · 5mo

I think that from an AI Alignment perspective, giving AI so much control over its training seems very problematic. What we are mostly left with is controlling the interface that AI has to physical reality, i.e. sensors and actuators.

For now, it seems to me that AI is mostly affecting the virtual world. I think the moment when AI can competently and more directly influence physical reality would be a tipping point, because then it could cause many more changes to the world.

I would say that the ability to do continuous learning is required to adapt well to the complexity of physical reality. So a big improvement in continuous learning might be an important next goalpost to watch for.

Posts

Satire: Sam Altman gets grilled by the Financial Times for his kitchen and his cooking skills + what this might say about him (2 points, 5mo)
An artistic illustration of Scalable Oversight - "A world apart, neither gods nor mortals" (1 point, 6mo)
Hyppotherapy (-3 points, 1y)
Some desirable properties of automated wisdom (3 points, 1y)
What and how much makes a difference? (7 points, 1y)