Send me anonymous feedback: https://docs.google.com/forms/d/e/1FAIpQLScLKiFJbQiuRYBhrBbVYUo_c6Xf0f8DN_blbfpJ-2Ml39g1zA/viewform

Any type of feedback is welcome, including arguments that a post/comment I wrote is net negative.

Some quick info about me:

I have a background in computer science (BSc+MSc; my MSc thesis was in NLP and ML, though not in deep learning).

You can also find me on the EA Forum.

Feel free to reach out by sending me a PM here or on my website.

The Opt-Out Clause

This is one of those "surprise! now that you've read this, things might be different" posts.

The surprise factor may be appealing from the perspective of a writer, but I'm in favor of having a norm against it (e.g. setting an expectation for authors to add a relevant preceding content note to such posts).

Thoughts on gradient hacking

The two pieces of logic can use the same activation values as their input. For example, suppose they both (independently) cause failure if a certain activation value is above some threshold (in which case each piece of logic "ruins" a different critical activation value).
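The setup above can be sketched in code. This is only an illustration of the structure being described; the function names, activation indices, and thresholds are all hypothetical, not taken from the post:

```python
# Hypothetical sketch: two independent pieces of logic read the same
# activation vector, and each sabotages the output when a *different*
# activation value crosses its own threshold.

def failure_logic_a(activations, threshold=0.9):
    # Piece A "ruins" one critical activation value (index 3 here).
    return activations[3] > threshold

def failure_logic_b(activations, threshold=0.9):
    # Piece B independently "ruins" a different one (index 7 here).
    return activations[7] > threshold

def model_output(activations, clean_output):
    # If either piece fires, the model deliberately produces a bad output.
    if failure_logic_a(activations) or failure_logic_b(activations):
        return None  # stand-in for a deliberately wrong answer
    return clean_output
```

The point is that neither piece needs to compare its input against the other's; each one conditions failure on the shared activations directly.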

The Effectiveness Of Masks is Limited

If I had to guess, an equivalent RCT with N95s would find them to be ~2x as effective as surgical masks.

(My best guess: The difference is much more dramatic than that because non-sealing masks allow unfiltered air to leak from the edges; so the amount of aerosols that gets inhaled while wearing a regular mask can easily be 2 orders of magnitude larger than when wearing an N95/FFP2/FFP3 respirator.)
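The rough arithmetic behind this can be made explicit. The numbers below are illustrative assumptions of mine (a 95% filter efficiency and a 50% edge-leak fraction for a non-sealing mask), not measurements:

```python
# Illustrative model: air that leaks around the edges is unfiltered;
# the rest passes through the filter material.

def inhaled_fraction(filter_efficiency, leak_fraction):
    # Fraction of ambient aerosols that reaches the wearer.
    return leak_fraction + (1 - leak_fraction) * (1 - filter_efficiency)

sealed_n95 = inhaled_fraction(filter_efficiency=0.95, leak_fraction=0.0)
# 0.05: only what the filter misses gets through.

leaky_mask = inhaled_fraction(filter_efficiency=0.95, leak_fraction=0.5)
# 0.525: the leak term dominates, regardless of how good the filter is.

print(leaky_mask / sealed_n95)  # ~10x with these assumed numbers
```

With these particular numbers the gap is about one order of magnitude; higher filter efficiency for the sealed respirator (e.g. FFP3) and a larger leak fraction for the loose mask push the ratio toward the two orders of magnitude mentioned above.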

The Effectiveness Of Masks is Limited

(Just responding to the title...) This post isn't making any claims about the effectiveness of respirators, right? (e.g. N95/FFP2/FFP3)

How do you decide when to change N95/FFP-2 masks?

With 4 hours per day of use, N95 masks retain ~95% efficacy after 3 days, ~92% efficacy after 5 days, and drop to ~80% efficacy after 14 days (source).

I think the paper you linked to reports on an experiment in which respirators were worn for a total of 8 hours per day, not 4.

How do you decide when to change N95/FFP-2 masks?

Re "Loss of electrostatic charge worsens filtration efficacy", this paper might also be relevant (e.g. figures 1 and 2; though I don't know how to interpret them).

Obstacles to gradient hacking

As I said here, the idea does not involve having some "dedicated" piece of logic C that makes the model fail if the outputs of the two malicious pieces of logic don't satisfy some condition.

Gradient descent is not just more efficient genetic algorithms

I don't see how this is relevant here. If it is the case that changing only X does not affect the loss, and changing only Y does not affect the loss, then SGD would not change them (their gradient components will be zero), even if changing them both can affect the loss.

Formalizing Objections against Surrogate Goals

Regarding the following part of the view that you commented on:

But if we want AI to implement them, we should mainly work on solving foundational issues in decision and game theory with an aim toward AI.

Just wanted to add: It may be important to consider potential downside risks of such work. It may be important to be vigilant when working on certain topics in game theory and e.g. make certain binding commitments before investigating certain issues, because otherwise one might lose a commitment race in logical time. (I think this is a special case of a more general argument made in Multiverse-wide Cooperation via Correlated Decision Making about how it may be important to make certain commitments before discovering certain crucial considerations.)

Gradient descent is not just more efficient genetic algorithms

My formulation is broad enough that it doesn't have to be a dedicated piece of logic; there just has to be some way of looking at the rest of the network that depends on X and Y being the same.

But X and Y are not the same! For example, if the model is intended to classify images of animals, the computation X may correspond to [how many legs does the animal have?] and Y may correspond to [how large is the animal?]

This is what I take issue with - if there is a way to change both components simultaneously to have an effect on the loss, SGD will happily do that.

This seems wrong to me. SGD updates the weights in the direction of the gradient, and if changing a given weight alone does not affect the loss, then the gradient component associated with that weight will be 0, and thus SGD will not change that weight.
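A minimal toy example of this point: take a "loss" L(x, y) = x * y at the point (0, 0). Each partial derivative is zero there, so a gradient step leaves both weights unchanged, even though moving x and y together does change the loss. (The function and the finite-difference check below are my own illustration, not from the thread.)

```python
# Toy loss where neither weight alone affects the loss at (0, 0),
# but changing both together does.

def loss(x, y):
    return x * y

def numeric_grad(f, x, y, eps=1e-6):
    # Central finite differences for the two partial derivatives.
    dx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    dy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
    return dx, dy

dx, dy = numeric_grad(loss, 0.0, 0.0)
print(dx, dy)          # both ~0, so SGD moves neither weight
print(loss(0.1, 0.1))  # 0.01: a coordinated change does affect the loss
```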
