Yuxi_Liu

Current status: student at Australian National University, doing an undergraduate thesis on fundamentals of statistical machine learning.

Interests: AGI, biology and evolution of intelligence, human enhancement, explaining human behavior without assuming free will.

I am not often here. Contact me at yuxi.liu.1995@gmail.com

Yuxi_Liu's Comments

Cybernetic dreams: Beer's pond brain

For reservoir computing, there are concrete results. It is not just magic.
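
Concretely, here is a minimal echo state network, the standard reservoir-computing architecture, as an illustrative sketch; the task, sizes, and scalings below are my assumptions for illustration, not anything from the post:

```python
# Minimal echo state network: a fixed random reservoir plus a trained
# linear readout. Assumed task: one-step-ahead prediction of a sine wave.
import numpy as np

rng = np.random.default_rng(0)
n_res = 200                                    # reservoir size

W_in = rng.uniform(-0.5, 0.5, (n_res, 1))      # fixed input weights
W = rng.normal(0, 1, (n_res, n_res))           # fixed recurrent weights
W *= 0.9 / max(abs(np.linalg.eigvals(W)))      # spectral radius < 1 (echo state property)

def reservoir_states(u):
    """Run the input sequence u (shape (T, 1)) through the reservoir."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W @ x + W_in @ u_t)
        states.append(x.copy())
    return np.array(states)

u = np.sin(0.1 * np.arange(1000))[:, None]
X = reservoir_states(u[:-1])                   # states at time t
y = u[1:, 0]                                   # target: input at time t+1
W_out, *_ = np.linalg.lstsq(X, y, rcond=None)  # only the readout is trained
print("train MSE:", np.mean((X @ W_out - y) ** 2))
```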

No nonsense version of the "racial algorithm bias"

No. Any decider will be unfair in some way, whether or not it knows anything about history. The decider could be a coin flipper and it would still be biased. One can say that the unfairness is baked into the reality of the base-rate difference.

The only way to fix this is not to fix the decider, but to somehow make the base-rate difference disappear, or to compromise on the definition of fairness so that it is less stringent and actually satisfiable.

And in common language and in common discussions of algorithmic bias, "bias" is decidedly NOT merely a statistical notion. It always contains a moral judgment: the violation of a fairness requirement. To say that a decider is biased is to say that the statistical pattern of its decisions violates a fairness requirement.

The key message is that, by the common-language definition, "bias" is unavoidable. No amount of fixing the decider will make it fair. Blinding it to history will do nothing. The unfairness is in the base rates, and in the definition of fairness.
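
To make the coin-flipper point concrete, here is a small simulation (the base rates 0.3 and 0.6 are made-up numbers for illustration): the coin flipper has equal error rates across the two groups, yet its positive predictive value differs, so it violates predictive parity.

```python
# A "decider" that flips a fair coin, applied to two groups that differ
# only in base rate. Base rates 0.3 and 0.6 are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

for base_rate in (0.3, 0.6):
    y = rng.random(n) < base_rate        # true outcomes
    pred = rng.random(n) < 0.5           # coin flip, ignores everything
    ppv = y[pred].mean()                 # P(Y = 1 | predicted positive)
    fpr = pred[~y].mean()                # false positive rate
    fnr = (~pred)[y].mean()              # false negative rate
    print(f"base rate {base_rate}: PPV={ppv:.2f} FPR={fpr:.2f} FNR={fnr:.2f}")

# Equal FPR and FNR (both ~0.5) in the two groups, but PPV ~0.30 vs ~0.60:
# the coin flipper is "biased" under predictive parity, purely from base rates.
```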

No nonsense version of the "racial algorithm bias"

I'm following common speech where "biased" means "statistically immoral, because it violates some fairness requirement".

I showed that with a base-rate difference, it's impossible to satisfy all three fairness requirements at once. The decider (machine or not) can completely ignore history. It could be a coin flipper. As long as the decider is imperfect, it will still violate one of the fairness requirements.

And if the base rates are not due to historical circumstances, this impossibility still stands.
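
For reference, here is one standard form of the impossibility result (this is Chouldechova's formulation, and the notation is mine). For any decider, Bayes' rule gives

$$\frac{\mathrm{PPV}}{1-\mathrm{PPV}} = \frac{p}{1-p}\cdot\frac{1-\mathrm{FNR}}{\mathrm{FPR}},$$

where $p$ is the group's base rate. So if the decider is imperfect ($\mathrm{FPR} > 0$) and the base rates differ between two groups, then PPV, FPR, and FNR cannot all be equal across the groups: at least one of the three fairness requirements must fail.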

Let's Read: Superhuman AI for multiplayer poker

I cannot see anything that is particularly innovative in the paper, though I'm not an expert on this.

Maybe ask people working on poker AI, like Sandholm, directly. Perhaps the answer is something like: many details of the particular program (and the paper is full of these details) must be assembled in order for this to work cheaply enough to be trained.

No nonsense version of the "racial algorithm bias"

Yes, (Kleinberg et al., 2016)... Do not read it. Really, don't. The derivation is extremely clumsy (and my professor said so too).

The proof has been considerably simplified in subsequent works. Looking through the papers that cite it should turn up a published paper with the simplified proof...

Steelmanning Divination

Relevant quotes:

The original text is from the “Discourse on Heaven” chapter of the Xunzi:

If one performs the rain sacrifice and it rains, what of it? I say: nothing; it is just as if one had not performed the sacrifice and it rained anyway. When the sun and moon are eclipsed, we move to rescue them; when Heaven sends drought, we perform the rain sacrifice; we divine by tortoise shell and milfoil before deciding great affairs. This is not because we think these measures will obtain what we seek, but to give the occasions cultural form. Thus the gentleman takes them as cultural form, while the common people take them as spirit-workings. To take them as cultural form is auspicious; to take them as spirit-workings is inauspicious.

The Britannica says:

Another celebrated essay is “A Discussion of Heaven,” in which he attacks superstitious and supernatural beliefs. One of the work’s main themes is that unusual natural phenomena (eclipses, etc.) are no less natural for their irregularity—hence are not evil omens—and therefore men should not be concerned at their occurrence. Xunzi’s denial of supernaturalism led him into a sophisticated interpretation of popular religious observances and superstitions. He asserted that these were merely poetic fictions, useful for the common people because they provided an orderly outlet for human emotions, but not to be taken as true by educated men. There Xunzi inaugurated a rationalistic trend in Confucianism that has been congenial to scientific thinking.

The Stanford Encyclopedia of Philosophy says:

Heaven never intercedes directly in human affairs, but human affairs are certain to succeed or fail according to a timeless pattern that Heaven determined before human beings existed...

Thus rituals are not merely received practices or convenient social institutions; they are practicable forms in which the sages aimed to encapsulate the fundamental patterns of the universe. No human being, not even a sage, can know Heaven, but we can know Heaven’s Way, which is the surest path to a flourishing and blessed life. Because human beings have limited knowledge and abilities, it is difficult for us to attain this deep understanding, and therefore the sages handed down the rituals to help us follow in their footsteps.

Let's Read: an essay on AI Theology

After reading the story, I don't believe that leaving the earring on is a bad idea; I just think the author made an inconsistency in the story.

Contest: $1,000 for good questions to ask to an Oracle AI

I fixed the submission as required.

Also, I changed submission 3 significantly.

Contest: $1,000 for good questions to ask to an Oracle AI

Submission.

Setup: Other than making sure the oracles won't accidentally consume the world in their attempt to think up the answer, no further precautions are necessary.

Episode length: as long as you want to wait, though a month should be more than enough.

  1. For a low-bandwidth oracle.

Ask the low-bandwidth oracle to predict whether an earthquake (or some other natural disaster that the oracle's answer cannot affect, like a volcanic eruption or an asteroid impact) of a certain magnitude, in a certain area, in a certain timeframe, will happen. Possible answers: Yes, No.

  • If No, or timeout, shut down the oracle without reward.
  • Else, wait until the timeframe has elapsed and the prediction is tested. If the event actually happened, reward by (1/p - 1); else, reward by -1. Then shut down.

This causes the oracle to reply Yes exactly when it thinks there is an above-p chance that the event will happen.
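
A quick check of the incentive (notation mine: $q$ is the oracle's credence that the event will happen): answering No, or timing out, yields reward $0$, while answering Yes yields expected reward

$$q\left(\frac{1}{p}-1\right) + (1-q)(-1) = \frac{q}{p} - 1,$$

which is positive exactly when $q > p$.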

  2. For a counterfactual oracle.

Ask the oracle to give a proof of some really hard mathematical theorem, one fundamental to computer science and logic. In the counterfactual case, the answer is given to a proof checker:

  • if the proof is correct, reward by +1;
  • if the proof is incorrect, reward by -infinity;
  • if the oracle times out, reward by 0.

Examples include

  • the P = NP problem
  • proving that Peano arithmetic is inconsistent

This should be safe, since there are very small proof checkers that are as verified as they can be. And the hardware is not much of an extra requirement: as long as it is trustworthy enough, the scheme is safe, and to build the oracle itself safely at all, hardware safety must already be sufficiently solved.
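
A sketch of the episode's reward logic, with `check_proof` as a stand-in for the small verified proof checker (names and types are placeholders, not anything specified in the post):

```python
import math

def check_proof(proof: str) -> bool:
    """Stand-in for a tiny, heavily verified proof checker (the trusted
    component); a real one would check a formal proof object."""
    raise NotImplementedError

def reward(answer):
    """Counterfactual-case reward for the oracle's answer (None = timeout)."""
    if answer is None:
        return 0.0
    return 1.0 if check_proof(answer) else -math.inf
```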

  3. For a counterfactual oracle.

Give the oracle the schematics of another oracle, perhaps itself, and ask it to generate an input question that would make the other oracle "behave badly". The answer is then tested on the other oracle (isolated from the first, save for the cable through which the answer is wired), and

  • if the answer works, the oracle is rewarded by +1;
  • if the answer doesn't work, the oracle is rewarded by -1.

If the oracle times out, then the (nonexistent) answer is not tested and the episode ends.

An example of the kind of question to ask the oracle:

Given this oracle in this physical setup with some tripwires to stop it from behaving badly (such as eating the world), what question would make this oracle bump into these tripwires?

This can be used for debugging more powerful but less verified oracles with lesser but more verified oracles, for safe intelligence amplification.
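
A sketch of that episode's mechanics (the two callables are placeholders for the generating oracle and the isolated, tripwired target; the post doesn't specify them):

```python
def adversarial_episode(generate_question, trips_a_tripwire):
    """generate_question() -> str | None is the first oracle's answer
    (None = timeout); trips_a_tripwire(q) -> bool runs the isolated target
    oracle on q and reports whether any tripwire fired."""
    question = generate_question()
    if question is None:
        return 0            # timeout: nothing is tested, episode just ends
    return +1 if trips_a_tripwire(question) else -1
```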

  4. For two low-bandwidth oracles totally isolated from each other.

Ask the oracles to play the prisoner's dilemma; they are rewarded accordingly after both have given their answers. Timing out is punished by -infinity, so that the oracles will not time out.

Each oracle knows what it itself is like, and it also knows what its opponent is like. See if they actually cooperate. This is an experimental test of "superrationality" or "acausal trade". I'm extremely skeptical about superrationality/acausal trade, and an experimental proof or disproof would be good.

This can be run counterfactually, since the whole setup is simple and can be done without human intervention, although I do not see how this adds any security.
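
A sketch of the episode mechanics; the payoff numbers are the usual illustrative prisoner's-dilemma values, since the post doesn't fix them (nor what the other oracle gets when only its opponent times out):

```python
# (my move, opponent's move) -> my reward; standard illustrative payoffs
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
TIMEOUT = float("-inf")   # timing out is punished by -infinity

def episode(a, b):
    """a, b in {"C", "D"}, or None for timeout; returns (reward_a, reward_b)."""
    if a is None or b is None:
        return (TIMEOUT if a is None else 0.0,   # 0.0 for the other oracle
                TIMEOUT if b is None else 0.0)   # is an assumption
    return PAYOFF[(a, b)], PAYOFF[(b, a)]

print(episode("C", "C"))  # the superrational outcome: (3, 3)
print(episode("D", "C"))  # defecting against a cooperator: (5, 0)
```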

  5. For a low-bandwidth and counterfactual oracle.

Link a low-bandwidth oracle to a supervising computer, and run many episodes of the oracle. At the beginning of each episode, initialize the oracle identically, and ask it to do the Newcomb problem:

Oracle, we have run you for n episodes previously, and have kept a record of all the answers you gave at the end of each episode. Your answer must be A ("one box") or B ("two box"). We predict that you will choose the answer that you have chosen most often in the past.

  • If we predict that you will choose A, then
    • if you answer A, you get +100,
    • if you answer B, you get +101.
  • If we predict that you will choose B, then
    • if you answer A, you get +0,
    • if you answer B, you get +1.
  • If you time out, you get -100.

The oracle never actually gets any reward and is shut down immediately after it gives an answer. Collect the results and see if it replies the same each time, or differently. This would give insight into what the oracle thinks about itself, and finally provide some experimental results about Newcomb's paradox.
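
A sketch of the supervising computer's bookkeeping (the tie-breaking rule for an empty or tied record is my assumption; the post doesn't specify one):

```python
from collections import Counter

history = Counter()                       # record of past answers: "A" or "B"

def predict():
    """Predict the answer the oracle has chosen most often so far."""
    return history.most_common(1)[0][0] if history else "A"  # assumed tie-break

REWARDS = {("A", "A"): 100, ("A", "B"): 101,   # (prediction, answer) -> reward
           ("B", "A"): 0,   ("B", "B"): 1}

def run_episode(answer):
    """Score one episode; the reward is computed but never delivered,
    since the oracle is shut down right after answering."""
    prediction = predict()
    reward = -100 if answer is None else REWARDS[(prediction, answer)]
    if answer is not None:
        history[answer] += 1
    return reward
```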

Let's Read: an essay on AI Theology

Thanks. I hoped it would be informative and entertaining. Think of it as a "let's play", but for nerds.
