All of ryan_b's Comments + Replies

A few years after the fact: I suggested Airborne Contagion and Air Hygiene for Stripe's reprint program.

One measure of status is how far outside the field of accomplishment it extends. Using American public education as the standard, Leibniz is only known for calculus.

there is not any action that any living organism, much less humans, take without a specific goal

Ah, here is the crux for me. Consider these cases:

  • Compulsive behavior: it is relatively common for people to take actions without understanding why, and for people with OCD this even extends to actions that contradict their specific goals.
  • Rationalizing: virtually all people actively lie to themselves about what their goals are when they take an action, especially in response to prodding about the details of those goals after the fact.
  • Internal Family Systems and
... (read more)
1Anders Lindström24d
Thank you ryan_b for expanding on your thoughts; I have been under the weather for a week, and I meant to answer you earlier. To me, having a goal and knowing why I have that goal are two separate things, and a goal does not become less of a goal because you do not know its origin. Perhaps goals are a hierarchy. We all* have some default goals like eat, survive, and reproduce. On top of those we can add goals invented by ourselves or others. In the case where you are without a goal, I believe you still have goals defined by your biology. Every action or inaction is due to a goal. Why do you eat? Are you hungry? Bored? Tired? Compulsion? Want to gain weight? Want to lose weight? There is always a goal. Take people with OCD. In what way are those persons contradicting any goals by doing OCD stuff, like checking that the stove is off 157 times before leaving the house, so that they missed work? Yes, the goal of getting to work was missed, but the MORE important goal of not accidentally burning down the house, killing 35 neighbors, and being the disgrace of the neighborhood was effectively achieved. So it's not that fiddling with the stove was without a goal, canceling out the "real" goal of getting to work for a non-goal. They were just of different importance. If I may comment on your sex-qua-sex analogy: I am convinced that the sex act involved a social interaction where you wanted the other person(s) to behave in a specific way to make the act of sex as enjoyable as possible (whatever that may mean). The act of sex did not happen in a vacuum. You or the other person(s) wanted to have it, no matter what the goal was. And you or the other person(s) had to manipulate the other(s) to achieve whatever goal there was to the sex.  Yes, I agree that we need coordination with other people to achieve things, and that they may be benign. But to me there is no distinction between benign or malevolent attempts to persuade or influence someone. They are both acts of manipulation. Either y

A sports analogy is Moneyball.

The low counterfactual impact of a researcher is analogous to the insight that professional baseball players are largely interchangeable because they are all already selected from the extreme tail of baseball-playing ability — which is to say, the counterfactual impact of any given player added to the team is also low.

Of course in Moneyball they used this to get good-enough talent within budget, which is not the same as the researcher case.  All of fantasy sports is exactly a giant counterfactual exercise; I wonder how far we could get with 'fantasy labs' or something.

One way to identify counterfactually-excellent researchers would be to compare the magnitude of their "greatest achievement" against their secondary discoveries, because the credit that parade leaders get is often useful for propagating their future success, and the people who do more with that boost are the ones who should be given extra credit for originality (their idea) as opposed to novelty (their idea first). Newton and Leibniz both had remarkably successful and diverse achievements, which suggests that they were relatively high in counterfactual impact in most (if not all) of those fields.

Another approach would consider how many people or approaches to a problem had tried and failed to solve it: crediting the zeitgeist rather than Newton and/or Leibniz specifically seems to miss a critical question, namely, if neither of them had solved it, would it have taken an additional year, or more like 10 to 50? In their case, we have a proxy for an answer: ideas took months or years to spread at all beyond the "centers of discovery" at the time, and so although they clearly took only a few months or years to compete for the prize of first (and a few decades to argue over it), we can relatively safely conjecture that whichever anonymous contender was third in the running was likely behind on at least that timescale. That should be considered in contrast to Andrew Wiles, whose proof of Fermat's Last Theorem was efficiently and immediately published (and patched as needed).

This is also important because other, and in particular later, luminaries of the field (e.g. Mengoli, Mercator, various Bernoullis, Euler, etc.) might not have had the vocabulary necessary to make as many discoveries as quickly as they did, or to communicate those discoveries as effectively, if not for Newton and Leibniz's timely contributions.

I agree that processor clock speeds are not what we should measure when comparing the speed of human and AI thoughts. That being said, I have a proposal for the significance of the fact that the smallest operation for a CPU/GPU is much faster than the smallest operation for the brain.

The crux of my belief is that having faster fundamental operations means you can get to the same goal using a worse algorithm in the same amount of wall-clock time. That is to say, if the difference between the CPU and neuron is ~10x, then the CPU can achieve human performance us... (read more)
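That tradeoff can be sketched with a toy wall-clock model (the step counts and rates below are illustrative assumptions, not measured figures for brains or chips):

```python
# Toy model: wall-clock time = steps needed / steps per second.
# If hardware runs its basic operation 10x faster, an algorithm that
# needs 10x as many steps still finishes in the same wall-clock time.

def wall_clock_seconds(steps_needed: float, steps_per_second: float) -> float:
    """Seconds to finish, given a step budget and a step rate."""
    return steps_needed / steps_per_second

# Hypothetical "brain-like" algorithm: efficient, but slow basic steps.
brain_time = wall_clock_seconds(steps_needed=1e6, steps_per_second=1e2)

# Hypothetical machine algorithm: 10x the steps, but 10x faster steps.
machine_time = wall_clock_seconds(steps_needed=1e7, steps_per_second=1e3)

print(brain_time, machine_time)  # both 10000.0 seconds
```

The point of the sketch is only that a k-fold advantage in basic-operation speed buys tolerance for a k-fold worse step budget at equal wall-clock performance.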

It isn't quoted in the above selection of text, but I think this quote from the same chapter addresses your concern:

“I instantly saw something I admired no end. So while he was weighing my envelope, I remarked with enthusiasm: "I certainly wish I had your head of hair." He looked up, half-startled, his face beaming with smiles. "Well, it isn't as good as it used to be," he said modestly. I assured him that although it might have lost some of its pristine glory, nevertheless it was still magnificent. He was immensely pleased. We carried on a pleasant little con

... (read more)
3Anders Lindström1mo
Thank you ryan_b for your comment; I do not agree. I don't believe that there is any action that any living organism, much less humans, takes without a specific goal. When people say that they "just want to spread some selfless love in this grim world without asking for anything in return", they have a goal nonetheless. I cannot of course say exactly what kind of goal they have, but for the sake of simplicity say that Selflesslovespreader A wants to make other people feel good in order to feel good about making other people feel good. So how does Selflesslovespreader A know that the goal has been achieved in that interaction? Well, is it that far-fetched to assume that a smile or a thank-you from the person that the selfless love was directed at is a good measure of success? I.e., Selflesslovespreader A has manipulated the person to respond with a certain behavior that made it possible for Selflesslovespreader A to reach the goal of feeling good about making other people feel good. I believe there is a self-serving motive behind every so-called selfless act. This does not make the act less good or noble, but the act serves as a means for that person to reach a goal, whatever that goal is. Can a human perform any type of action without a goal, no matter how small or insignificant?

Out of curiosity, what makes this chapter seem Dark-Artsy to you?

3Anders Lindström1mo
I guess the question we need to ask ourselves is whether all human interactions are about manipulation or not. To me it seems that using words like persuade/inspire/motivate/stimulate etc. is just the politically correct way of saying what it actually is, which is manipulation. Manipulation is not bad in itself; it can be life-saving in the instance of talking someone down from a ledge. The perceived dark-artsy part of manipulation arises when the person who was manipulated into doing something realizes that the result is not what they wanted. Manipulation is what people say when, for instance, inspiration has not given them the results THEY wanted. You never hear a person say "oh, I was inspired to give $10k to a scammer online", but you do hear people say "oh, I was inspired by a friend to quit my day job and become a writer". Both were manipulated, but one still believes it is to their advantage to do what the manipulator made them do. If not all human interactions are about manipulation, what kind of interaction would that be, and how would it play out?

So the smarter one made rapid progress in novel (to them) environments, then revealed they were unaligned, and then the first round of well established alignment strategies caused them to employ deceptive alignment strategies, you say.


I don't see this distinction as mattering much: how many ASI paths are there which somehow never go through human-level AGI? On the flip side, every human-level AGI is an ASI risk.

The question of whether a human level AGI safety plan is workable is separate from the question of presence of ASI risk. Many AGI safety plans, not being impossibly watertight, rely on the AGI not being superintelligent, hence the distinction is crucial for the purpose of considering such plans. There is also some skepticism of it being possible to suddenly get an ASI, in which case the assumption of AGIs being approximately human level becomes implicit without getting imposed by necessity. The plans for dealing with ASI risk are separate, they go through the successful building of safe human level AGIs, which are supposed to be the keystone of solving the rest of the problems in the nick of time (or gradually, for people who don't expect fast emergence of superintelligence after AGI). The ASI risk then concerns reliability of the second kind of plans, of employing safe human level AGIs, rather than the first kind of plans, of building them.

I would perhaps urge Tyler Cowen to consider raising certain other theories of sudden leaps in status, then? To actually reason out what would be the consequences of such technological advancements, to ask what happens?


At a guess, people resist doing this because predictions about technology are already very difficult, and doing lots of them at once would be very very difficult.

But would it be possible to treat increasing AI capabilities as an increase in model or Knightian uncertainty? It feels like questions of the form "what happens to investment ... (read more)

I'm inclined to agree with your skepticism. Lately I attribute the low value of the information to the fact that the organization is the one that generates it in the first place. In practical terms the performance of the project, campaign, etc. will still be driven by the internal incentives for doing the work, and it is not remotely incompatible for bad incentives to go unchanged leading to consistently failing projects that are correctly predicted to consistently fail. In process terms, it's a bit like what's happening with AI art when it consumes too much AI art in training.

3Sinclair Chen3mo
Yes, and firms already experiment with different economic mechanisms to produce this self-generated information - this is just compensation and employee benefits, including stock options, commissions, and bonuses. In this frame, it seems like a bad idea to let employees bet against, say, projects shipping on time. A negative stake is the least-aligned form of compensation possible. There are hacks on top of a pure prediction market you could do to prevent people from having a negative stake. But I think once you realize the recursive aspect of the market, you may as well just ... design good compensation.  I'm also more enthusiastic about prediction markets on things mostly outside of employees' control that are still relevant to business decisions - market trends, actions of competitors and regulators, maybe consumer preferences. Though there's less reason for these to be internal.

The way info from the non-numerate gets incorporated into financial markets today is that more sophisticated people & firms scrape social media or look at statistics (like those generated by consumer activity). Markets do not need to be fully accessible for markets to be accurate.

I agree with this in general, but it doesn't seem true for the specific use-case motivating the post. The problem I am thinking about here is how to use a prediction market inside an organization. In this case we cannot rely on anyone who could get the information to put it into the... (read more)

3Sinclair Chen3mo
1. Prediction markets with even very small numbers of traders are remarkably well-calibrated. Try playing around with the data. Albeit these are traders drawn out of a wider population.
2. I am skeptical of prediction markets as tools within organizations, even very large organizations (like Google's Gleangen or Microsoft). It hasn't been very useful, and I don't think this is just a UX or culture issue; I think the information just isn't valuable enough. Better off running polls, doing user studies, or just letting project owners execute their big-brained vision. I'm more bullish on prediction markets that are part of a product/service, or part of an advertising campaign.
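The calibration point can be illustrated with a minimal simulation. The trader model here - unbiased noisy estimates averaged into a price - is an assumption for illustration, not a claim about any real market:

```python
import random

random.seed(0)

def market_price(true_p: float, n_traders: int, noise: float = 0.2) -> float:
    """Price as the average of traders' noisy, clamped estimates of true_p."""
    estimates = [min(1.0, max(0.0, random.gauss(true_p, noise)))
                 for _ in range(n_traders)]
    return sum(estimates) / len(estimates)

# For many hypothetical events, compare the 5-trader price to the
# observed frequency with which the event actually happens.
buckets = {}  # rounded price -> (hits, count)
for _ in range(20_000):
    p = random.random()
    price = round(market_price(p, n_traders=5), 1)
    hit = random.random() < p
    hits, count = buckets.get(price, (0, 0))
    buckets[price] = (hits + hit, count + 1)

for price in sorted(buckets):
    hits, count = buckets[price]
    print(f"price~{price:.1f}: observed frequency {hits / count:.2f} ({count} events)")
```

With unbiased traders, the mid-range buckets land close to their observed frequencies even with only five traders; clamping at 0 and 1 pulls the extreme buckets slightly toward the middle, which mirrors the mild edge miscalibration real small markets show.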

I really want to read the takedown of Helion.

For starters, see this and this. To summarize:

  • They're using D-He3 fusion to make fewer neutrons, because they want to capture electricity directly from the plasma, but that's harder than D-T fusion.
  • Plasma has MHD instabilities, which are also why solar flares happen. These are worse at higher power levels. Devices of the type Helion uses have been unable to manage conditions that produce much fusion, even D-T fusion, without instabilities getting bad.
  • Helion has said they rely on the particle gyroradius in magnetic fields being comparable to the plasma size for stability. But fusion requires many collisions, inevitably, so most particles would then escape before fusing. There is no solution for their approach.
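For reference, the gyroradius in the last point is the standard Larmor radius (textbook plasma physics, not taken from Helion's materials):

```latex
% Larmor (gyro)radius of a particle with mass m, charge q, and
% velocity component v_perp across a magnetic field of strength B:
r_L = \frac{m v_\perp}{|q| \, B}
```

The tension described in the bullet is that their stability claim needs \(r_L\) comparable to the plasma size, while keeping particles around long enough for many collisions pushes in the opposite direction.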

I like the reasoning on the front, but I disagree. The reason I don't think it holds is because the Western Front as we understand it is what happened after the British Expeditionary Force managed to disrupt the German offensive into France, and the defenses that were deployed were based on the field conditions as they existed.

What I am proposing is that initial invasion go directly into the teeth of the untested defenses which were built for the imagined future war (which was over a period of 40 years or so before actual war broke out). I reason these def... (read more)

Indeed you might - in fact I suggested attacking through the French border directly in the other question where we aid Germany/Austria rather than try to prevent the war.

The idea of defending against France is an interesting one - the invasion plans called for knocking out France first and Russia second based on the speed with which they expected each country to mobilize, and Russia is much slower to conquer just based on how far everyone has to walk. Do you estimate choosing to face an invasion from France would be worth whatever they gain from Russia, in the thinking of German command?

I genuinely don't know anything about Germany's plans for Russia post invasion in the WW1 case, so I cannot tell.

Well, it turned out that attacking on The Western Front in WWI was basically impossible. The front barely moved over 4 years, and that was with far more opposing soldiers over a much wider front. So the best strategy for Germany would have been to dig in really deep and just wait for France to exhaust itself. At least that's my take as something of an amateur.
Answer by ryan_b · Nov 27, 2023

Under these conditions yes, through the mechanism of persuading German High Command to invade through the French border directly rather than going through Belgium. Without the Belgian invasion, Britain does not enter the war (or at least not so soon); without Britain in the war Germany likely does not choose unrestricted submarine warfare in the Atlantic; without unrestricted submarine warfare the US cannot be induced to enter the war on the side of the French.

As to why the direct invasion would work, we have the evidence from clashes in the field that the... (read more)

But the British could have entered the war anyway. After all, British war goals were to maintain a balance of power in Europe, and they did not want France and Russia to fall and Germany to become too strong.
Answer by ryan_b · Nov 27, 2023

My best path for a yes is through the mechanism of Great Britain being very explicit with Germany about their intent to abide by the 1839 Treaty of London.

For context, this is the one where the signatories promise to declare war on whoever invades Belgium, and was Britain's entry point into the war. There were at least some high ranking military officers who believed that had Britain said specifically that they would go to war if Belgium were invaded, Germany would have chosen not to invade.

OK, but if I am roleplaying the German side, I might choose to still start WWI but just not attack through Belgium. I will hold the Western Front with France and attack Russia.

Power seeking mostly succeeds by the other agents not realizing what is going on, so it either takes them by surprise or they don’t even notice it happened until the power is exerted.

Yet power seeking is a symmetric behavior, and power is scarce. The defense is to compete for power against the other agent, and try to eliminate them if possible.

3Gerald Monroe3mo
I agree. For power that comes from (money/reputation/military/hype) developing AI systems, this is where I was wondering where the symmetry is for those who wish to stop AI from being developed. The 'doomer' faction over time won't be benefitting from AI, and thus their relative power would become weaker and weaker with time. Assuming AI systems have utility value to humans, at least initially, if the AI systems make a treacherous turn it will be later, after initially providing large benefits.

With this corporate battle right now, Altman is going to announce whatever advances they have made, raise billions of dollars, and the next squabble will have the support of people that Altman has personally enriched. I heard a rumor it works out to be 3.6 million per employee with the next funding round, with a 2-year cliff, so 1.8 million/year on the line. This would be why almost all OpenAI employees stated they would leave over it. So it's a direct example of the general symmetry issue mentioned above. (Leaving would also pay: I suspect Microsoft would have offered more liquid RSUs and probably matched the OAI base, maybe large signing bonuses. Not 1.8m, but good TC. Just me speculating; I don't know the offer details, and it is likely Microsoft hadn't decided.)

The only 'obvious' way I could see doomers gaining power with time would be if early AI systems cause mass murder events or damage similar to nuclear meltdowns. This would give them political support, and it would scale with AI system capability, as more powerful systems cause more and more deaths and more and more damage.

I agree with this, and I am insatiably curious about what was behind their decisions about how to handle it.

But my initial reaction based on what we have seen is that it wouldn’t have worked, because Sam Altman comes to the meeting with a pre-rallied employee base and the backing of Microsoft. Since Ilya reversed on the employee revolt, I doubt he would have gone along with the plan when presented a split of OpenAI up front.

I agree in the main, and I think it is worth emphasizing that power-seeking is a skillset, which is orthogonal to values; we should put it in the Dark Arts pile, and anyone involved in running an org should learn it at least enough to defend against it.

6Gerald Monroe3mo
What would be your defense?  Agents successful at seeking power have more power.  

I think the central confusion here is: why, in the face of someone explicitly trying to take over the board, would the rest of the board just keep that person around?

None of the things you suggested have any bearing whatsoever on whether Sam Altman would continue to try and take over the board. If he has no board position but is still the CEO, he can still do whatever he wants with the company, and also try to take over the board. If he is removed as CEO but remains on the board, he will still try to take over the board. Packing the board has no bearing on... (read more)

In principle, the 4 members of the board had an option which would look much better: to call a meeting of all 6 board members, and to say at that meeting, "hey, the 4 of us think we should remove Sam from the company and remove Greg from the board, let's discuss this matter before we take a vote: tell us why we should not do that".

That would be honorable, and would look honorable, and the public relations situation would look much better for them.

The reason they had not done that was, I believe, that they did not feel confident they could resist persuasion ... (read more)

0[comment deleted]3mo

Well gang, it looks like we have come to the part where we are struggling directly over the destiny of humanity. In addition to persuasion and incentives, we'll have to account for the explicit fights over control of the orgs.

Silver lining: it means we have critical mass for enough of humanity and enough wealth in play to die with our boots on, at least!

O O3mo3622

The last few days should show it's not enough to have power cemented in technicalities, board seats, or legal contracts. Power comes from gaining the support of billionaires, journalists, and human capital. It's kind of crazy that Sam Altman essentially rewrote the rules, whether he was justified or not.

The problem is the types of people who are good at gaining power tend to have values that are incompatible with EA. The real silver lining to me, is that while it's clear Sam Altman is power-seeking, he's also probably a better option to have there than the rest of the people good at seeking power, who might not even entertain x-risk.

I agree with all of this in principle, but I am hung up on the fact that it is so opaque. Up until now the board has determinedly remained opaque.

If corporate seppuku is on the table, why not be transparent? How does being opaque serve the mission?

I wrote a LOT of words in response to this, talking about personal professional experiences that are not something I coherently understand myself as having a duty (or timeless permission?) to share, so I have reduced my response to something shorter and more general. (Applying my own logic to my own words, in realtime!)

There are many cases (arguably stupid cases or counter-productive cases, but cases) that come up more and more when deals and laws and contracts become highly entangling.

It's illegal to "simply" ask people for money in exchange for giving them... (read more)

While I feel very unsure of a lot of my picks, I chose to interpret this as declaring a prior before the new evidence we expect comes out. I also took the unsure button route on several where my uncertainty is almost total.

I'm not sure this makes sense, except as a psychological trick.

This is indeed about half the pitch in my view. The strategy comes in two parts as I understand it: one, the psychological trick of triggering a fear of successful lawsuits; two, slightly increasing the likelihood that if the risk becomes reality they will have to pay significant penalties.

Ah, oops! My expectations are reversed for Shear; him I strongly expect to be as exact as humanly possible.

With that update, I'm inclined to agree with your hypothesis.

I would normally agree with this, except it does not seem to me like the board is particularly deliberate about their communication so far. If they are conscientious enough about their communication to craft it down to the word, why did they handle the whole affair in the way they seem to have so far?

I feel like a group of people who did not see fit to provide context or justifications to either their employees or largest shareholder when changing company leadership and board composition probably also wouldn't weigh each word carefully when explaining the ... (read more)

  1. The quote is from Emmett Shear, not a board member.
  2. The board is also following the "don't say anything literally false" policy by saying practically nothing publicly.
  3. Just as I infer from Shear's qualifier that the firing did have something to do with safety, I infer from the board's public silence that their reason for the firing isn't one that would win back the departing OpenAI members (or would only do so at a cost that's not worth paying). 
  4. This is consistent with it being a safety concern shared by the superalignment team (who by and large didn't
... (read more)

I don't understand the mechanism of the double-cross here. How would they get the pro-EA and safety side to trigger the crisis? And why would the safety/EA people, who are trying to make everything more predictable and controllable, be the ones who are purged from influence?

The EAs on the board have national security ties. 
One explanation that comes to mind is that AI already offers extremely powerful manipulation capabilities and governments are already racing to acquire these capabilities. I'm very confused about the events that have been taking place, but one factor about which I have very little doubt is that the NSA has acquired access to smartphone operating systems and smartphone microphones throughout the OpenAI building (it's just one building, and a really important one, so it's also reasonably likely that it has been bugged). Whether they were doing anything with that access is much less clear.

I am also confused. It would make me happy if we got some relevant information about this in the coming days.

If there is an email chain where all the engineers are speculating wildly about what could go wrong, then that poses a legal risk to the company if, and only if, they are later being sued because one of those wild speculations was actually correct.

Close - the risk being managed is one of total costs to go through the process, rather than their outcomes per se. So the risk to the company is increased if any of the wild speculations happens to be consistent with any future lawsuit, whether correct or spurious. How I think legal departments model this is tha... (read more)

The owner of a firm investing in R&D doesn’t account for all the benefits their technology might bring to non-paying consumers and firms, but they do care about the benefits that R&D will bring to the firm long into the future, even after their death. One part of this is that owners don’t face term limits that incentivize pump-and-dump attempts to garner voter support.

This does not match my expectations, even if it agrees with how I would feel were I the owner.

For example, the top ten Nasdaq companies spent ~$222B between them on R&D, which is ... (read more)

1Rob Lucas6d
"The average shareholder definitely does not care about the value of R&D to the firm long after their deaths, or I suspect any time at all after they sell the stock."

This was addressed in the post: the price of the stock today (when it's being sold) is a prediction of its future value. Even if you only care about the price that you can sell it at today, that means that you care about at least the things that can lead to predictably greater value in the future, including R&D, because the person you're selling to cares about those things.

Also worth noting: the reason that the 2% value is meaningful is that if firms captured 100% of the value, they would be incentivized to increase the amount produced to the maximally efficient level. When they only capture 2% of the value, they are no longer incentivized to create the maximally efficient amount (stop producing when cost to produce = value produced). This is basically why externalities lead to market inefficiencies. The issue isn't that they won't produce it at all; it's that they will underproduce it.
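The underproduction point can be made concrete with a toy model (the quadratic cost function and the specific numbers are illustrative assumptions; only the 2% capture rate comes from the discussion):

```python
# A firm picks R&D quantity q to maximize captured value minus cost:
#   maximize  capture * value_per_unit * q - q**2
# First-order condition:  capture * value_per_unit = 2 * q.

def chosen_quantity(value_per_unit: float, capture: float) -> float:
    """Profit-maximizing q under a quadratic cost q**2."""
    return capture * value_per_unit / 2

social_optimum = chosen_quantity(100.0, capture=1.00)  # firm captures all value
private_choice = chosen_quantity(100.0, capture=0.02)  # firm captures only 2%

print(social_optimum, private_choice)  # 50.0 vs ~1.0 units
```

The firm still produces something at 2% capture - just a fiftieth of the socially optimal amount - which matches the "underproduce, not zero" conclusion above.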

You stochastic parrot!

This, but unironically. What's the magic thing which keeps a human from qualifying as a stochastic parrot? And won't that magic thing be resolved when all the LLMs deploy with vector databases to resolve that pesky emphasis on symbols?

Because they have never shown an example in any of their takedowns of GPT-4 that I have not also heard a human say.

I register a guess this is to keep the content of lesswrong from being scraped for LLMs and similar purposes.

3Nicolas Lacombe4mo
According to this comment, it looks like a member of the LW site devs is ok with LW being scraped by GPT.

Because it makes it look like they're trying to conceal evidence, which is much worse for them than simply maybe being negligent.

The part that confuses me about this is twofold: one, none of the communication policies I have worked under went as far as to say something like "don't talk about risks via email because we don't want it to be discoverable in court," which is approximately what would be needed to establish something like adverse inference; two, any documented communication policy is itself discoverable, so I expect that it will wind up in evi... (read more)

What you're missing is how specific and narrow my original point was. It only looks like you are concealing evidence if you do two things simultaneously:

  • Challenge witness testimony by saying it's not corroborated by some discoverable record.
  • Have a policy whereby you avoid creating the discoverable record, periodically delete the discoverable record, or otherwise make it unlikely that the record would corroborate the testimony.

So basically you have to pick one or the other. And you're probably going to pick the second. So as a witness, you probably just aren't going to have to worry about those kinds of challenges.

This is a really good and interesting point, but I ultimately don't think it will work. I'm going to say I just told my boss because I didn't understand 34.B.II.c; nobody does. The fact that I didn't follow the exception rule isn't going to convince anyone that it didn't happen. The jury will get to see the communication policy in its full glory. Plaintiff counsel will ask me whether anyone ever explained the policy or rule 34.B.II.c to me (of course not), and whether we had a class to make sure we understood how to use the policy in practice, like we had for the company sexual harassment policy (of course there was no class). And failure to follow the exception rule won't prove that I didn't think it was serious. I told my boss and colleagues. People knew. What good would getting out my legal dictionary and parsing the text of 34.B.II.c do as far as helping people address the issues I raised? I dunno; I never even considered it as a possibility. I'm just a simple country engineer who wanted people to fix a safety issue.

I'm not sure this makes sense, except as a psychological trick. Like, the legal risk is that the actual risk will manifest itself and then someone will sue. I feel like everyone just understands this clearly already. Remember, I've already told everyone about the risk. If the risk manifests, of course we will be s

If I testify that I told the CEO that our widgets could explode and kill you, the opposition isn't going to be so stupid as to ask why there isn't any record of me bringing this to the CEO's attention. The first lawyer will hardly be able to contain his delight as he asks the court to mark "WidgetCo Safe Communication Guidelines" for evidence. The opposition would much rather just admit that it happened.

This feels like an extreme claim. What makes you conclude that personal testimony weighs more under the law than written documents? Why would the defense p... (read more)

I am a little confused by this. If there is an email chain where all the engineers are speculating wildly about what could go wrong, then that poses a legal risk to the company if, and only if, they are later sued because one of those wild speculations was actually correct. That is not to say the speculation is necessarily useful: an infinite list of speculative failure modes, containing a tiny number of realistic ones, is just as useless as a zero-length list. But I would prefer that the choice between a longer list (where true dangers are missed because they are listed alongside reams of nonsense) and a shorter list (where true dangers are missed because they were omitted) was made to maximise effectiveness, not minimise legal exposure*.

*This is not a criticism of any organisation or person operating sensibly within the legal system, but a criticism of said system.

What makes you conclude that personal testimony weighs more under the law than written documents?


It doesn't matter. Plaintiff wants to prove that an engineer told the CEO that the widgets were dangerous. So he introduces testimony from the engineer that the engineer told the CEO that the widgets were dangerous. Defendant does not dispute this. How much more weight could you possibly want? The only other thing you could do would be to ask defendant to stipulate that the engineer told the CEO about the widgets. I think most lawyers wouldn't bother.

Why would

... (read more)

This is a really good post. It is simple, coherent, and does a good job accounting for problems.

A concrete example of this kind of thinking focused on getting stuff is jacobjacob's How I buy things when Lightcone wants them fast post.

Yeah, How I Buy Things is a pretty central example of this kind of thinking and has a good list of specific tactics. Thank you for the compliment! 

introduces concepts . . . into the framework of the legislation.

I strongly agree that the significance of the EO is that it establishes concepts for the government to work with.

From the perspective of the e/acc people, it means the beast has caught their scent.

Hey, that's the first time I've seen safety emphasized as a budgetary commitment. Huzzah for another inch in the Overton window!

5Zach Stein-Perlman4mo
said "we call on major tech companies and public funders to allocate at least one-third of their AI R&D budget to ensuring safety and ethical use"

While unusual for interpersonal stuff, practicing your speeches in every detail down to facial expressions and gestures is simply correct if you want to be good at making them.

So what's data science work like in the Israeli military?

Haha, not that different from data science in other large organizations, but from time to time you do guard shifts because it's the army.

People try to make you do research on problems with next to no data, and there are literal human lives on the line, but you have to convince everyone it's not possible and you should focus on other, more approachable problems.

Let's take, for instance, a non classified problem: Automatically detecting people trying to sneak towards you. Seemingly easy: computer vision is a solved problem! However, sneaking tends to h... (read more)

7Yonatan Cale5mo
I was a software developer in the Israeli military (not a data scientist), and I was part of a course that constantly trains software developers for various units. The big picture is that the military is a huge organization, and there is a ton of room for software to improve everything. I can't talk about specific uses (just like I can't describe our tanks or whatever, sorry if that's what you're asking, and sorry I'm not giving the full picture), but even things like logistics or servers or healthcare have big teams working on them. Also remember the military started a long time ago, when there weren't good off-the-shelf solutions for everything, and imagine how big the companies are that make many of the products that you (or orgs) use.

Question: why is a set of ideas about alignment being adjacent to capabilities only a one-way relationship? More directly, why can't this mindset be used to pull alignment gains out of capability research?

4Thomas Kwa5mo
Not sure exactly what the question is, but research styles from ML capabilities have definitely been useful for alignment, e.g. the idea of publishing benchmarks.
5Nathaniel Monson5mo
I think it is two-way, which is why many (almost all?) alignment researchers have spent a significant amount of time looking at ML models and capabilities, and have guesses about where those are going.

Yes, and the virtue that is most important is the one that allowed Petrov to not doom the world. By contrast, the two most popular choices were about refusing to doom the world, and resisting social pressure, neither of which were features of the event.

If there was a poll in connection to Arkhipov, my answer might change.

I voted for correctly reporting your epistemic state. I claim that this is the actual virtue Petrov displayed, and that the belief that his primary virtue was refusing to destroy the world by bucking the chain of command is mistaken. From the Wikipedia article:

Petrov later indicated that the influences on his decision included that he had been told a US strike would be all-out, so five missiles seemed an illogical start;[3] that the launch detection system was new and, in his view, not yet wholly trustworthy; that the message pass

... (read more)
The poll did not ask what virtue Petrov most displayed. It asked what virtue you think is most important.

There seems to often be a missing ‘wait this is super scary, right?’ mood attached to making AI everyone’s therapist. There is obvious great potential for positive outcomes. There are also some rather obvious catastrophic failure modes.

Reach out and touch faith, amirite?

Yep, it’s nicely packaged right here:

To all doing that (directly and purposefully for its own sake, rather than as a mournful negative externality to alignment research): I request you stop.

The author, like many others, misunderstands what people mean when they talk about capabilities vs. alignment. They do not mean that everything is either one or the other, and that nothing is both.

I had a similar reaction, which made me want to go looking for the source of disagreement. Do you have a post or thread that comes to mind which makes this distinction well? Most of what I am able to find just sort of gestures at some tradeoffs, which seems like a situation where we would expect the kind of misunderstanding you describe.

4Daniel Kokotajlo5mo
Perhaps this? Request: stop advancing AI capabilities - LessWrong 2.0 viewer

At first blush this appears to guarantee Anthropic access to enough compute for the next couple of training iterations, at least. I infer the larger training runs are back on.

Can you explain how you estimated the coefficients of -3 and 6? I couldn't find where they came from in the rest of the piece.

1K. Liam Smith5mo
Thanks for asking. I used a logistic regression on a small dataset of slave revolts I made. It's very small, so I have low confidence in the result, but I plan on doing a follow-up with plots of the dataset.
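For readers unfamiliar with where such coefficients come from: a logistic regression fits weights so that the log-odds of the outcome are a linear function of the features, and the sign and size of each weight summarize that feature's effect. A minimal sketch in pure Python — the feature names and toy data below are invented for illustration and are not the actual revolt dataset or its coefficients:

```python
import math

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit y ~ sigmoid(b0 + b1*x1 + b2*x2) by stochastic gradient descent."""
    b = [0.0, 0.0, 0.0]  # intercept, weight on x1, weight on x2
    for _ in range(epochs):
        for (x1, x2), y in zip(xs, ys):
            z = b[0] + b[1] * x1 + b[2] * x2
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability
            err = y - p                     # gradient of the log-likelihood
            b[0] += lr * err
            b[1] += lr * err * x1
            b[2] += lr * err * x2
    return b

# Toy, hypothetical data: outcomes (1 = success) tend to occur when x2 is
# large and x1 is small, so we expect a negative weight on x1 and a
# positive weight on x2.
xs = [(2, 0), (3, 1), (2.5, 0.5), (0, 2), (1, 3), (0.5, 2.5)]
ys = [0, 0, 0, 1, 1, 1]
b0, b1, b2 = fit_logistic(xs, ys)
```

With real data, fitted weights of roughly −3 and 6 would play analogous roles: the negative weight pushes the predicted probability of the outcome down as its feature grows, and the positive one pushes it up.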