anonymousaisafety

Contest: An Alien Message

Why do you say that Kolmogorov complexity isn't the right measure?

most uniformly sampled programs of equal KC that produce a string of equal length.

...

"typical" program with this KC.

I am worried that you might have this backwards?

Kolmogorov complexity describes the output, not the program. The output file has low Kolmogorov complexity because there exists a short computer program to describe it. 
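As a rough illustration (my own sketch, not something from the thread): the string below is a million characters long, yet its Kolmogorov complexity is low, because a program only a few lines long reproduces it exactly. The complexity bound attaches to the string, not to any particular program that emits it.

```python
# Toy illustration: a long output with low Kolmogorov complexity.
# The output is 1,000,000 characters, but this short program fully
# specifies it, so the string's Kolmogorov complexity is bounded by
# roughly the length of this program (plus a language-dependent constant).

def generate() -> str:
    # Any short rule works; a repeating pattern is the simplest example.
    return "0110" * 250_000

output = generate()
print(len(output))  # 1000000

# By contrast, a uniformly random string of the same length almost
# certainly has no generating program much shorter than the string itself.
```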

Contest: An Alien Message

I have mixed thoughts on this.

I was delighted to see someone else put forth a challenge, and impressed by the number of people who took it up.

I'm disappointed though that the file used a trivial encoding. When I first saw the comments suggesting it was just all doubles, I was really hoping that it wouldn't turn out to be that.

I think the disconnect may be that in the original That Alien Message post, the story starts with aliens deliberately sending a message for humanity to decode, as this thread did here. It is explicitly described as such:

From the first 96 bits, then, it becomes clear that this pattern is not an optimal, compressed encoding of anything.  The obvious thought is that the sequence is meant to convey instructions for decoding a compressed message to follow...

But when I argued against the capability of decoding binary files in the I No Longer Believe Intelligence To Be Magical thread, that argument was on a tangent -- is it possible to decode an arbitrary binary file? I specifically ruled out trivial encodings in my reasoning. I listed the features that make a file difficult to decode. A huge issue is ambiguity: in almost all binary files, the first problem is just identifying where fields start and end.

I gave examples like

  1. Camera RAW formats
  2. Compressed image formats like PNG or JPG
  3. Video codecs
  4. Any binary protocol between applications
    1. Network traffic
    2. Serialization to or from disk
    3. Data in RAM

On the other hand, an array of doubles falls much more into this bucket

data that is basically designed to be interpreted correctly, i.e. the data, even though it is in a binary format, is self-describing.
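To illustrate what "self-describing" buys you in practice, here is a minimal sketch (the file name is hypothetical, and it assumes the guess "flat array of little-endian IEEE-754 doubles" is correct): one struct call decodes the entire payload, and a quick look at the values confirms or refutes the guess.

```python
import struct

# Hypothetical file containing nothing but a flat array of little-endian
# IEEE-754 doubles. The name "mystery.bin" is illustrative only.
with open("mystery.bin", "rb") as f:
    raw = f.read()

count = len(raw) // 8                       # 8 bytes per double
values = struct.unpack(f"<{count}d", raw[:count * 8])

# Sanity check on the guess: plausible magnitudes and smooth structure
# suggest the interpretation is right; NaNs or absurd exponents everywhere
# suggest it is wrong.
print(values[:5])
```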

With all of the above said, the reason I did not bother uploading an example file in the first thread is frankly that it would have taken me some number of hours to create, and I didn't think enough people would be interested in actually decoding it to justify the time spent. That assumption seems wrong now! It seems like people really enjoyed the challenge. I will update accordingly, and I'll likely post my example of a file later this week, after I have an evening or day free to do so.

Contest: An Alien Message

https://en.wikipedia.org/wiki/Kolmogorov_complexity

The fact that the program is so short indicates that the solution is simple. A complex solution would require a much longer program to specify it.

Air Conditioner Repair

I gave this post a strong disagree.

Contest: An Alien Message

Some thoughts for people looking at this:

  • It's common for binary schemas to distinguish between headers and data. There could be a single header at the start of the file, or there could be multiple headers throughout the file with data following each header.
  • There are often checksums on the header, and sometimes on the data too. It's common for the checksum to follow the thing being checksummed, i.e. the last bytes of the header are a checksum, or the last bytes after the data are a checksum. 16-bit and 32-bit CRCs are common.
  • If the data represents a sequence of messages, e.g. from a sensor, there will often be a counter of some sort in the header on each message, e.g. a 1-, 2-, or 4-byte counter that provides ordering ("message 1", "message 2", "message N") and wraps back to 0. A minimal parsing sketch along these lines follows this list.
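Here is a minimal sketch of the kind of layout described above. The specific field sizes, offsets, and the choice of CRC-32 are assumptions for illustration, not any real schema:

```python
import struct
import zlib

# Hypothetical message layout, purely for illustration:
#   header: 2-byte magic, 4-byte counter, 2-byte payload length,
#           then a 4-byte CRC-32 over those first 8 header bytes
#   data:   <length> payload bytes, then a 4-byte CRC-32 over the payload
HEADER = struct.Struct("<HIHI")

def parse_message(buf: bytes, offset: int):
    magic, counter, length, header_crc = HEADER.unpack_from(buf, offset)
    if zlib.crc32(buf[offset:offset + 8]) != header_crc:
        raise ValueError("header checksum mismatch")
    start = offset + HEADER.size
    payload = buf[start:start + length]
    (payload_crc,) = struct.unpack_from("<I", buf, start + length)
    if zlib.crc32(payload) != payload_crc:
        raise ValueError("payload checksum mismatch")
    # The next message starts right after the payload CRC.
    return counter, payload, start + length + 4
```

When the schema isn't known, counters and checksums like these are useful landmarks: a field that increments by one per message, or a field that matches a CRC over the preceding bytes, strongly constrains where the boundaries are.
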
Air Conditioner Test Results & Discussion

I’m not sure if your comment is disagreeing with any of this. It sounds like we’re on the same page about the fact that exact reasoning is prohibitively costly, and so you will be reasoning approximately, will often miss things, etc.

I agree. The term I've heard to describe this state is "violent agreement". 

so in practice wrong conclusions are almost always due to a combination of both "not knowing enough" and “not thinking hard enough” / “not being smart enough.”

The only thing I was trying to point out (maybe more so for everyone else reading the commentary than for you specifically) is that it is perfectly rational for an actor to "not think hard enough" about some problem and thus arrive at a wrong conclusion (or a correct conclusion for a wrong reason), because that actor has higher-priority items requiring their attention, and those put hard time constraints on how many cycles they can dedicate to lower-priority items, e.g. debating AC efficiency. Rational actors will try to minimize the likelihood that they've reached a wrong conclusion, but they'll also be forced to stay within some limit on allowed computation cycles, and on most problems that means the computation cost plus any hard time constraint is the actual limiting factor.

Then again, I think that's more or less what you meant by

in some sense you’ve probably spent too long thinking about the question relative to doing something else

In engineering R&D we often do a bunch of upfront thinking at the start of a project, and the goal is to identify where we have uncertainty or risk in our proposed design. Then, rather than spend 2 more months in meetings debating back-and-forth who has done the napkin math correctly, we'll take the things we're uncertain about and design prototypes to burn down risk directly.

Conor Sullivan's Shortform

First, it only targeted Windows machines running a Microsoft SQL Server reachable via the public internet. I would not be surprised if ~70% or more of the theoretically reachable targets were not infected because they ran some other OS (e.g. Linux) or different server software (e.g. MySQL). This page makes me think the market share was actually more like 15%, so 85% of servers were not impacted. By not impacted, I mean "not actively contributing to the spread of the worm". They were, however, impacted by the denial-of-service caused by traffic from infected servers.

Second, the UDP port (1434) that the worm used could be trivially blocked. I have discussed network hardening in many of my posts. The easiest way to prevent yourself from getting hacked is to not let the hacker send traffic to you -- blocking IP ranges, ports, unneeded Ethernet or IP protocols, and other options available in both network hardware (routers) and software firewalls provides a low-cost and highly effective way to do so. This contained the denial-of-service.
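As a toy illustration of that kind of rule (a real deployment would express this as a router ACL or host firewall rule rather than application code; the fields here are simplified):

```python
# Toy packet-filter decision, purely illustrative: drop UDP traffic to the
# SQL Server Resolution Service port (1434/udp) that the worm exploited.
BLOCKED_UDP_PORTS = {1434}

def should_drop(protocol: str, dst_port: int) -> bool:
    return protocol == "udp" and dst_port in BLOCKED_UDP_PORTS

print(should_drop("udp", 1434))  # True: an inbound worm probe is rejected
print(should_drop("tcp", 1433))  # False: normal SQL Server traffic passes
```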

Third, the worm's attack only persisted in RAM, so the only thing a host had to do was restart the infected application. Combined with the second point, this would prevent the machine from being reinfected.

This graph[1] shows the result of widespread adoption of filter rules within hours of the attack being detected.

  1. ^
Air Conditioner Test Results & Discussion

This was actually a kind of fun test case for a priori reasoning. I think that I should have been able to notice the consideration denkenberger raised, but I didn't think of it. In fact when I started reading his comment my immediate reaction was "this methodology is so simple, how could the equilibrium infiltration rate end up being relevant?" My guess would be that my a priori reasoning about AI is wrong in tons of similar ways even in "simple" cases. (Though obviously the whole complexity scale is shifted up a lot, since I've spent hundreds of hours thinking about key questions.)

This idea -- that you should have been able to notice the issue with infiltration rates -- is what I've been questioning when I ask "what is the computational complexity of general intelligence" or "what does rational decision making look like in a world with computational costs for reasoning".

There is a mindset that people are simply not rational enough, and if they were more rational, they wouldn't fall into those traps. Instead, they would more accurately model the situation, correctly anticipate what will and won't matter, and arrive at the right answer, just by exercising more careful, diligent thought.

My hypothesis is that whatever that optimal "general intelligence" algorithm[1] is -- the one where you reason a priori from first principles, and then you exhaustively check all of your assumptions for which one might be wrong, and then you recursively use that checking to re-reason from first principles -- it is computationally inefficient enough that for most interesting[2] problems, it is not realistic to assume that it can run to completion in any reasonable[3] time with realistic computation resources, e.g. a human brain, or a supercomputer.[4]

I suspect that the human brain is implementing some type of randomized, vaguely-Monte-Carlo-like algorithm when reasoning, which is how people can (1) often solve problems in a reasonable amount of time[5], (2) often miss factors during a priori reasoning but understand them easily after they've seen them confirmed experimentally, (3) miss different things from person to person, (4) often continue to generate insights if they keep thinking about a problem for an arbitrarily long period of time[6], and (5) often find that those insights generated over an arbitrarily long period of time are only loosely correlated[7].

In that world, while it is true that you should have been able to notice the problem, there is no guarantee on how much time it would have taken you to do so.
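A toy sketch of that picture, purely as an illustration and not a model of the brain: a randomized local search started from different seeds settles in different parts of a bumpy solution space, the way different people (or different walks) surface different considerations about the same problem.

```python
import math
import random

def objective(x: float) -> float:
    # A bumpy landscape with many local optima.
    return math.sin(3 * x) + 0.1 * x

def run_of_thought(seed: int, steps: int = 1000) -> float:
    # One "run of thought": random starting point, greedy local improvement.
    rng = random.Random(seed)
    x = rng.uniform(-10, 10)
    for _ in range(steps):
        candidate = x + rng.gauss(0, 0.5)
        if objective(candidate) > objective(x):
            x = candidate
    return round(x, 2)

# Different seeds end up at different local optima; none is guaranteed to
# find the global optimum in any fixed number of steps.
print([run_of_thought(seed) for seed in range(5)])
```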

  1. ^

    The "God algorithm" for reasoning, to use a term that Jeff Atwood wrote about in this blog post. It describes the idea of an optimal algorithm that isn't possible to actually use, but the value of thinking about that algorithm is that it gives you a target to aim towards.

  2. ^

    The use of the word "interesting" is intended to describe the nature of problems in the real world, which require institutional knowledge, or context-dependent reasoning.

  3. ^

    The use of the word "reasonable" is intended to describe the fact that if a building is on fire and you are inside of it, you need to calculate the optimal route out of that burning building in a time period that is less than a few minutes in length in order to maximize your chance of survival. Likewise, if you are tasked to solve a problem at work, you have somewhere between weeks and months to show progress or be moved to a separate problem. For proving a theorem, it might be reasonable to spend 10+ years on it if there's nothing necessitating a more immediate solution.

  4. ^

    This is mostly based on an observation that for any scenario with, say, some fixed number of "obvious" factors influencing it, there are effectively arbitrarily many "other" factors that may influence the scenario, and the process of deterministically ordering an arbitrarily long list and then proceeding down that list from "most likely to impact the scenario" to "least likely to impact the scenario", manually checking whether each "other" factor actually does matter, has an arbitrarily high computational cost.

  5. ^

    Feel free to put "solve" in quotes and read this as "halt in a reasonable time" instead. Getting the correct answer is optional.

  6. ^

    Like mathematical proofs, or the thing where people take a walk and suddenly realize the answer to a question they've been considering.

  7. ^

    It's like the algorithm jumped from one part of solution space where it was stuck to a random, new part of the solution space and that's where it made progress.

Let's See You Write That Corrigibility Tag

I deliberately tried to focus on "external" safety features because I assumed everyone else was going to follow the task-as-directed and give a list of "internal" safety features. I figured that I would just wait until I could signal-boost my preferred list of "internal" safety features, and I'm happy to do so now -- I think Lauro Langosco's list here is excellent and captures my own intuition for what I'd expect from a minimally useful AGI, and it probably does so in a clearer, easier-to-read manner than what I would have written. It's very similar to some of the other highly upvoted lists, but I prefer it because it explicitly mentions various ways to avoid weird maximization pitfalls, like allowing the AGI to fail at completing a task.
