daozaich - LessWrong

How much does cybersecurity reduce AI risk?

Answer by daozaich · Jun 14, 2022 · 10

I doubt your optimism on the level of security that is realistically achievable. Don't get me wrong: The software industry has made huge progress (at large costs!) in terms of security. Where before, most stuff popped a shell if you looked at it funny, it is now a large effort for many targets.

Further progress will be made.

If we extrapolate this progress -- we will optimistically reach a point where impactful reliable 0day is out of reach for most hobbyists and criminals, and the domain of natsec of great powers.

But I don't see how raising this waterline will help for AI risk in particular?

As in: godlike superintelligence is game over anyway. An AI that is as good at exploitation as the rest of humanity taken together is beyond what can realistically be defended against at the security level of widely deployed systems. An AI that doesn't reach that level without human assistance is probably not lethal anyway.

On the other hand, one could imagine pivotal acts by humans with limited-but-substantial AI assistance that rely on the lack of wide-spread security.

Pricing human + weakish AI collaborations out of the world-domination-via-hacking game might actually make matters worse, in so far as weakish non-independent AI might be easier to keep aligned.

A somewhat dystopian wholesale surveillance of almost every word written and said by humans, combined with AI that is good enough at text comprehension and energy efficient enough to pervasively and correctly identify scary-looking research and flag it to human operators for intervention is plausibly pivotal and alignable, and makes for much better cyberpunk novels than burning GPUs anyway (mentally paging cstross, I want my Gibson homage in form of a "Turing Police"/laundry-verse crossover).

Also, good that you mentioned rowhammer. Rowhammer and the DRAM industry's half-baked, pitiful response are humankind's capitulation in terms of "making at least some systems actually watertight".

Implications of automated ontology identification

daozaich3y60

The fixed point problem is worse than you think. Take the Hungarian astrology example, with an initial easy set with both a length limitation (e.g. < 100k characters) and simplicity limitation.

Now I propose a very simple improvement scheme: If the article ends in a whitespace character, then try to classify the shortened article with last character removed.

This gives you an infinite sequence of better and better decision boundaries (each time, a couple of new cases are solved: the ones that are of length 100k + $N$, end in at least $N$ whitespace characters, and are in the easy set once the whitespace has been stripped). This nicely converges to the classifier that trims all trailing whitespace and then asks its initial classifier.
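A minimal sketch of this scheme, to make the "infinite sequence of improvements" concrete. The decision rule inside `base_classifier` is a made-up placeholder, and the names `improve`, `iterate` and `limit_classifier` are hypothetical; only the length limit and the strip-one-whitespace step come from the description above.

```python
from typing import Callable, Optional

MAX_LEN = 100_000  # length limit of the easy set (articles must be shorter)

# A classifier returns True/False, or None when the case is outside its domain.
Classifier = Callable[[str], Optional[bool]]


def base_classifier(article: str) -> Optional[bool]:
    """Toy stand-in for the initial classifier; the rule itself is a placeholder."""
    if len(article) >= MAX_LEN:
        return None  # not in the easy set
    return "astrology" in article


def improve(clf: Classifier) -> Classifier:
    """One improvement step: on failure, retry with one trailing whitespace removed."""
    def improved(article: str) -> Optional[bool]:
        answer = clf(article)
        if answer is None and article and article[-1].isspace():
            return clf(article[:-1])
        return answer
    return improved


def iterate(clf: Classifier, steps: int) -> Classifier:
    """Apply the improvement step a finite number of times."""
    for _ in range(steps):
        clf = improve(clf)
    return clf


def limit_classifier(article: str) -> Optional[bool]:
    """The limit of the sequence: trim all trailing whitespace, then ask the base."""
    return base_classifier(article.rstrip())


# An article of length MAX_LEN + 2 whose last three characters are whitespace.
article = "astrology " + "x" * (MAX_LEN - 11) + "   "
print(iterate(base_classifier, 2)(article))  # still outside the domain
print(iterate(base_classifier, 3)(article))  # solved after three steps
print(limit_classifier(article))             # solved by the limit
```

Every finite number of steps leaves some article with more trailing whitespace unsolved, so no element of the sequence equals the limit classifier.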

What I'm trying to say here is: The space of cases to consider can be large in many dimensions. The countable limit of a sequence of extensions need not be a fixed point of the magical improvement oracle.

Generally, I'd go in a different direction: Instead of arguing about iterated improvement, argue that of course you cannot correctly extrapolate all decision problems from a limited amount of labeled easy cases and limited context. The style of counter-example is to construct two settings ("models" in the lingo of logic) A and B with the same labeled easy set (and the same context made available to the classifier), where the correct answer for some datapoint x differs between the two settings. Hence, safe extrapolation must always conservatively answer NO to x, and cannot be expected to answer all queries correctly from limited training data (typical YES / NO / MAYBE split).
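A toy instance of this style of counter-example, as a sketch only. The two models, the easy set and the function `safe_answer` are all invented for illustration; nothing here is taken from the original post.

```python
from typing import Callable, Iterable

Model = Callable[[int], bool]

# Hypothetical easy set: the only datapoints for which labels are available.
EASY_SET = range(10)


def model_a(x: int) -> bool:
    """Setting A: the true concept is 'x is even'."""
    return x % 2 == 0


def model_b(x: int) -> bool:
    """Setting B: the true concept is 'x is even and below 100'."""
    return x % 2 == 0 and x < 100


def safe_answer(x: int, models: Iterable[Model]) -> str:
    """Answer YES or NO only if every model consistent with the data agrees."""
    answers = {model(x) for model in models}
    if answers == {True}:
        return "YES"
    if answers == {False}:
        return "NO"
    return "MAYBE"  # a conservative classifier has to treat this as NO


# Both settings produce exactly the same labels on the easy set ...
print(all(model_a(x) == model_b(x) for x in EASY_SET))
# ... yet they disagree on x = 100, so no safe extrapolation can settle it.
print(safe_answer(4, [model_a, model_b]))
print(safe_answer(100, [model_a, model_b]))
```

The classifier sees identical training data in both settings, so whatever it answers for 100 is wrong in one of them unless it abstains.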

I think the discussion about the fixed point or limit of iterative improvement does not lead to the actually relevant argument that extrapolation cannot conjure information out of nowhere?

You could cut it out completely without weakening the argument that certain types of automated ontology identification are impossible.

Math: Textbooks and the DTP pipeline

daozaich6y10

The Definition-Theorem-Proof style is just a way of compressing communication. In reality, heuristic / proof-outline comes first; then, you do some work to fill the technical gaps and match to the existing canon, in order to improve readability and conform to academic standards.

Imho, this is also the proper way of reading maths papers / books: Zoom in on the meat. Once you have understood the core argument, it is often unnecessary to read definitions or theorems at all (Definition: Whatever is needed for the core argument to work. Theorem: Whatever the core argument shows). Due to the perennial mismatch between historic definitions and theorems and the specific core arguments, this also leaves you with stronger results than are stated in the paper / book, which is quite important: You are standing on the shoulders of giants, but the giants could not foresee where you want to go.

RFC: Mental phenomena in AGI alignment

daozaich6y30

This paints a bleak picture for the possibility of aligning mindless AGI since behavioral methods of alignment are likely to result in divergence from human values and algorithmic methods are too complex for us to succeed at implementing.

To me it appears like the terms cancel out: Assuming we are able to overcome the difficulties of more symbolic AI design, the prospect of aligning such an AI seems less hard.

In other words, the main risk is wasting effort on alignment strategies that turn out to be mismatched to the eventually implemented AI.

What will we do with the free energy?

daozaich6y10

The negative prices are a failure of the market / regulation; they don't actually mean that you have free energy.

That being said, the question of the most economical opportunistic use of intermittent energy makes sense.

Why it took so long to do the Fermi calculation right?

daozaich6y190

No. It boils down to the following fact: If you take given estimates on the distribution of parameter values at face value, then:

(1) The expected number of observable alien civilizations is medium-large.

(2) If you consider the distribution of the number of alien civs, you get a large probability of zero, and a small probability of "very very many aliens", that integrates up to the medium-large expectation value.

Previous discussions computed (1) and falsely observed a conflict with astronomical observations, and totally failed to compute (2) from their own input data. This is unquestionably an embarrassing failure of the field.
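The gap between (1) and (2) is easy to reproduce numerically. The following is a toy Monte Carlo sketch, not the calculation from the paper under discussion: the star count and both log-uniform parameter ranges are made-up assumptions, chosen only so that the uncertainty spans many orders of magnitude.

```python
import math
import random


def simulate(samples: int = 100_000, seed: int = 0) -> tuple[float, float]:
    """Return (expected number of civilizations, probability of none at all)."""
    rng = random.Random(seed)
    n_stars = 1e11  # hypothetical number of candidate stars
    total = 0.0
    p_empty = 0.0
    for _ in range(samples):
        # Assumed log-uniform uncertainty over each factor (illustrative only).
        f_life = 10 ** rng.uniform(-30, 0)  # chance that life arises per star
        f_rest = 10 ** rng.uniform(-3, 0)   # all remaining factors lumped together
        rate = n_stars * f_life * f_rest    # expected civilizations for this draw
        total += rate
        # Treating the actual count as Poisson with this rate, P(count = 0) = exp(-rate).
        p_empty += math.exp(-rate)
    return total / samples, p_empty / samples


mean_civs, prob_none = simulate()
print(f"expected number of civilizations: {mean_civs:.3g}")
print(f"probability of an empty sky:      {prob_none:.2f}")
```

Under these assumed ranges the mean is dominated by the rare draws with optimistic parameters, while most of the probability mass sits on draws where the expected count is far below one.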

Logical uncertainty and Mathematical uncertainty

daozaich6y30

What is logical induction's take on probabilistic algorithms? That should be the easiest test-case.

Say, before "PRIMES is in P", we had perfectly fine probabilistic algorithms for checking primality. A good theory of mathematical logic with uncertainty should permit us to use such an algorithm, without random oracle, for things you place as "logical uncertainty". As far as I understood, the typical mathematician's take is to just ignore this foundational issue and do what's right (channeling Thurston: Mathematicians are in the business of producing human understanding, not formal proofs).
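For concreteness, here is a sketch of one such probabilistic algorithm, the Miller-Rabin test. The comment does not name a specific algorithm, so this choice is an assumption; the function name and the default of 20 rounds are likewise arbitrary. Note that the bases come from an ordinary pseudo-random generator rather than a true random oracle.

```python
import random
from typing import Optional


def is_probable_prime(n: int, rounds: int = 20,
                      rng: Optional[random.Random] = None) -> bool:
    """Miller-Rabin: never rejects a prime; accepts a composite with
    probability at most 4**-rounds over the choice of bases."""
    rng = rng or random.Random()
    if n < 2:
        return False
    # Handle small primes and their multiples by trial division.
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)  # pseudo-random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


print(is_probable_prime(2**61 - 1))       # a Mersenne prime
print(is_probable_prime(41 * 61 * 101))   # a composite with no small factors
```

The "True" answer is never a proof, only very strong evidence, which is exactly the kind of statement a theory of logical uncertainty would have to make sense of.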

Monty Hall in the Wild

daozaich6y110

It’s excellent news! Your boss is a lot more likely to complain about some minor detail if you’re doing great on everything else, like actually getting the work done with your team.

Unfortunately this way of thinking has a huge, giant failure mode: It allows you to rationalize away critique about points you consider irrelevant, but that are important to your interlocutor. Sometimes people / institutions consider it really important that you hand in your expense sheets correctly or turn up on time for work, and finishing your project on time with brilliant results is not a replacement for "professional demeanor". This was not a cheap lesson for me; people did tell me, but I kinda shrugged it off with this kind of glib attitude.

Editor Mini-Guide

daozaich6y70

Is there a way of getting "pure markdown" (no wysiwyg at all) including Latex? Alternatively, a hotkey-less version of the editor (give me buttons/menus for all functionality)?

I'm asking because my browser (chromium) eats the hotkeys, and latex (testing: $\Sigma$ ) appears not to be parsed from markdown. I would be happy with any syntax you choose. For example \Sigma; alternatively the github classic of using backticks appears still unused here.

edit: huh, backticks are in use and HTML tags get eaten.

Beyond Astronomical Waste

daozaich6y30

Isn't all this massively dependent on how your utility $U$ scales with the total number $N$ of well-spent computations (e.g. one-bit computes)?

That is, I'm asking for a gut feeling here: What are your relative utilities for $10^{100}$, $10^{110}$, $10^{120}$, $10^{130}$ universes?

Say, $U(0)=0$, $U(10^{100})=1$ (gauge fixing); instant pain-free end-of-universe is zero utility, and a successful colonization of the entire universe with a suboptimal black hole-farming near heat-death is unit utility.

Now, per definitionem, the utility $U(N)$ of an $N$-computation outcome is the inverse of the probability $p$ at which you become indifferent to the following gamble: Immediate end-of-the-world at probability $(1-p)$ vs an upgrade of computational world-size to $N$ at probability $p$.

I would personally guess that $U(10^{130})< 2 $; i.e. this upgrade would probably not be worth a 50% risk of extinction. This is massively sublinear scaling.
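Spelled out, assuming the alternative to the gamble is keeping the baseline $10^{100}$ outcome for certain (the comment implies this through its normalization but does not state it), the indifference condition reads:

$$p \, U(N) + (1-p) \, U(0) = U(10^{100}) \quad\Longrightarrow\quad p \, U(N) = 1 \quad\Longrightarrow\quad U(N) = \frac{1}{p}.$$

At $p = 1/2$ this gives $U(N) = 2$, so $U(10^{130}) < 2$ is the same statement as refusing the upgrade to $10^{130}$ at even odds of extinction. For comparison, linear scaling would give $U(10^{130}) = 10^{30}$, i.e. accepting the gamble at any $p > 10^{-30}$.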
