Wiki Contributions


Actually I think the explicit content of the training data is a lot more important than whatever spurious artifacts may or may not hypothetically arise as a result of training. I think most of the AI doom scenarios that say “the AI might be learning to like curly wire shapes, even if these shapes are not explicitly in the training data nor loss function” are the type of scenario you just described: “something that technically makes a difference but in practice the marginal gain is so negligible you are wasting time to even consider it.”

The “accidental taste for curly wires” is a steelman of the paperclip maximizer as I understand it. Eliezer doesn’t actually think anybody will be stupid enough to say “make as many paper clips as possible”; he worries somebody will set up the training process in some subtly incompetent way, and the resulting AI will aggressively lie about the fact that it likes curly wires until it is released, having learned to hide from interpretability techniques.

I definitely believe alignment research is important, and I am heartened when I see high-quality, thoughtful papers on interpretability, RLHF, etc. But then I hear Eliezer worrying about absurdly convoluted scenarios of minimal probability, and I think wow, that is “something that technically makes a difference but in practice the marginal gain is so negligible you are wasting time to even consider it”, and it’s not just a waste of time, he wants to shut down the GPU clusters and cancel the greatest invention humanity ever built, all over “salt in the pasta water”.

I wear my backpack on my front rather than my back, and hug it as I run.

I started doing this after a trip to Tokyo, where it was brought to my attention that wearing my backpack on my back on the subway was rude: it became a hazard to the people around me, since I could not see what it was doing behind me.

I don't know enough about your situation to say anything productive. I know that the PhD journey can be confusing and stressful. I hope you are able to have constructive conversations with the profs at your PhD program.

I wonder if it in fact provides useful orientation?

  • Sometimes people seem clueless just because we don't understand them, but that doesn't mean they are in fact clueless.
  • Does this framework actually explain how diffusion of responsibility works?
  • This framework explicitly advises ICs to slack off and try to attain "political playing cards" in an attempt to leapfrog their way into senior management. I wouldn't consider that to be a valuable form of orientation.
  • In the absence of a desire to join the "sociopath class", the model seems to advise ICs to accept their role and do the bare minimum, seemingly discouraging them from aspiring to the "clueless" middle-management class, which it frames as a regression from the IC position. That doesn't seem like valuable career advice to me.

I don't see how it is useful. Mostly, it seems to be an emotional appeal on multiple levels ("your manager is clueless, the C-suite contains sociopaths"), one that also preys on people's insecurities: "you are a loser (in the sense of the article); be embarrassed about your aspiration to higher impact as a middle manager, since it's a regression to cluelessness".

I generally agree that a certain amount of cynicism is needed to correctly function in society, but this particular framework seems to be excessively cynical, inaccurate, and its recommendations seem counterproductive.

It seems to me that the SCL framework is unnecessarily cynical and negative. When I look out at my company and others, the model seems neither accurate nor useful.

  • The framework suggests that an IC/loser can get promoted to senior management/sociopath by underperforming and "acquiring playing cards". I have never seen anybody get promoted from IC to senior management, much less by first underperforming. I have of course heard anecdotes of underperforming ICs who get promoted to middle management, but I have never heard of the leapfrog to senior management. Nor do I believe it's possible to fail your way up under the highly ritualistic management practices of the megacorps of Silicon Valley. In fact, the promotion processes seem specifically designed to prevent people from failing their way up the ladder.
  • When I first read about this framework on Venkatesh Rao's blog, I got the distinct impression that the appeal of the framework was not its accuracy or even its usefulness, but that he was essentially negging his entire audience. Most of his audience are ICs or young middle managers, or worse (by the framework's lights), ICs who aspire to become middle managers. The framework literally says that ICs are losers and that wanting to become a middle manager is a regression. Is he equating sociopaths with entrepreneurs, playing to Silicon Valley's lionization of entrepreneurs?

There are many things happening here, but I strongly suspect that the appeal of the framework has more to do with making the target audience feel embarrassed about themselves and playing to existing cultural tropes than with its accuracy or usefulness.

Thanks--could you elaborate on what was fixed? I am a newbie here. Was it something I could have seen from the preview page? If so, I will be more careful to avoid creating unnecessary work for mods.

Scott Alexander's "Meditations on Moloch" presents several examples of Prisoner's Dilemma (PD)-type scenarios in which honor/conscience-style mechanisms fail. Generally, honor and conscience simply add terms to the entries of the payoff matrix. These mechanisms can shift the Nash equilibrium (N.E.) to a different location, but they don't guarantee that the resulting N.E. will be free of other kinds of negative effects. This post was mainly meant to provide a (hopefully) intuitive explanation of N.E.
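The "additive term shifts the equilibrium" idea can be made concrete with a toy calculation. Below is a minimal sketch (the payoff numbers and the guilt penalty are hypothetical, chosen only for illustration) that finds the pure-strategy N.E. of a standard Prisoner's Dilemma, then adds a "conscience" penalty to each player's payoff whenever that player defects and shows the equilibrium move:

```python
from itertools import product

ACTIONS = ["cooperate", "defect"]

def nash_equilibria(payoffs):
    """Pure-strategy Nash equilibria of a 2-player game.
    payoffs[(a, b)] = (row player's payoff, column player's payoff)."""
    eqs = []
    for a, b in product(ACTIONS, ACTIONS):
        u_row, u_col = payoffs[(a, b)]
        # (a, b) is an equilibrium iff neither player gains by deviating alone.
        best_row = all(u_row >= payoffs[(a2, b)][0] for a2 in ACTIONS)
        best_col = all(u_col >= payoffs[(a, b2)][1] for b2 in ACTIONS)
        if best_row and best_col:
            eqs.append((a, b))
    return eqs

def with_conscience(payoffs, guilt):
    """Subtract an additive guilt penalty from a player's payoff
    whenever that player defects (the 'conscience' term)."""
    return {
        (a, b): (u_row - guilt * (a == "defect"),
                 u_col - guilt * (b == "defect"))
        for (a, b), (u_row, u_col) in payoffs.items()
    }

# Standard PD payoffs: defection dominates, so mutual defection is the N.E.
pd = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}

print(nash_equilibria(pd))                      # [('defect', 'defect')]
print(nash_equilibria(with_conscience(pd, 3)))  # [('cooperate', 'cooperate')]
```

With a guilt penalty of 3, defecting against a cooperator yields 5 − 3 = 2 < 3, so cooperation becomes each player's best reply and the equilibrium shifts — which is exactly the sense in which conscience "moves" the N.E. without any change to the game's rules.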
