Quick Takes

lc4d1116

I seriously doubt on priors that Boeing corporate is murdering employees.

Showing 3 of 6 replies
2lc9h
Robin Hanson has apparently asked the same thing. It seems like such a bizarre question to me:
* Most people do not have the constitution or agency for criminal murder.
* Most companies do not have secrets large enough that assassinations would reduce the size of their problems in expectation.
* Most people who work at large companies don't really give a shit if that company gets fined or gets into legal trouble, and so they don't have the motivation to personally risk anything organizing murders to prevent lawsuits.
3Ben Pace4h
I think my model of people is that people are very much changed by the affordances that society gives them and the pressures they are under. In contrast with this statement, a lot of hunter-gatherer people had to be able to fight to the death, so I don't buy that it's entirely about the human constitution.

I think if it were a known thing that you could hire an assassin to kill an employee and that, unless you messed up and left quite explicit evidence connecting you, you'd get away with it, then there'd be enough pressure to cause people in extremis to do it a few times per year, even in just high-stakes business settings. Also, my impression is that business or political assassinations exist to this day in many countries; a little searching suggests Russia, Mexico, Venezuela, possibly Nigeria, and more.

I generally put a lot more importance on tracking which norms are actually being endorsed and enforced by the group / society, as opposed to primarily counting on individual ethical reasoning or individual ethical consciences. (TBC, I also am not currently buying that this is an assassination in the US, but I didn't find this reasoning compelling.)
lc4h20

Also my impression is that business or political assassinations exist to this day in many countries; a little searching suggests Russia, Mexico, Venezuela, possibly Nigeria, and more.

Oh definitely. In Mexico in particular, business pairs up with organized crime all of the time to strong-arm competitors. But this happens when there's an "organized crime" that tycoons can cheaply (in terms of risk) pair up with. Also, OP asked specifically about why companies don't assassinate whistleblowers all the time.

a lot of hunter-gatherer people had to be able to fight

... (read more)
William_S4dΩ681568

I worked at OpenAI for three years, from 2021-2024, on the Alignment team, which eventually became the Superalignment team. I worked on scalable oversight, as part of the team developing critiques as a technique for using language models to spot mistakes in other language models. I then worked to refine an idea from Nick Cammarata into a method for using a language model to generate explanations for features in language models. I was then promoted to managing a team of 4 people which worked on trying to understand language model features in context, leading to t... (read more)

Showing 3 of 27 replies
Linch4h20

I can see some arguments in your direction but would tentatively guess the opposite. 

2Lech Mazur2d
Do you know if the origin of this idea for them was a psychedelic or dissociative trip? I'd give it at least even odds, with most of the remaining chances being meditation or Eastern religions...
8JenniferRM1d
Wait, you know smart people who have NOT, at some point in their life: (1) taken a psychedelic, NOR (2) meditated, NOR (3) thought about any of Buddhism, Jainism, Hinduism, Taoism, Confucianism, etc???

To be clear to naive readers: psychedelics are, in fact, non-trivially dangerous. I personally worry I already have "an arguably-unfair and a probably-too-high share" of "shaman genes" and I don't feel I need exogenous sources of weirdness at this point. But in the SF Bay Area (and places on the internet memetically downstream from IRL communities there) a lot of that is going around, memetically (in stories about it) and perhaps mimetically (via monkey see, monkey do).

The first time you use a serious one you're likely getting a permanent modification to your personality (+0.5 stddev to your Openness?), and arguably/sorta each time you do a new one, or do a higher dose, or whatever, you've committed "1% of a personality suicide" by disrupting some of your most neurologically complex commitments.

To a first approximation my advice is simply "don't do it". HOWEVER: this latter consideration actually suggests: anyone seriously and truly considering suicide should perhaps take a low-dose psychedelic FIRST (with at least two loving tripsitters and due care), since it is also maybe/sorta "suicide", but it leaves a body behind that most people will think is still the same person, and so they won't cry very much, and so on?

To calibrate this perspective a bit, I also expect that even if cryonics works, it will also cause an unusually large amount of personality shift. A tolerable amount. An amount that leaves behind a personality that is similar-enough-to-the-current-one-to-not-have-triggered-a-ship-of-theseus-violation-in-one-modification-cycle. Much more than a stressful day and then bad nightmares and a feeling of regret the next day, but weirder. With cryonics, you might wake up to some effects that are roughly equivalent to "having taken a potion of youthful rejuvenation, an

I was going to write an April Fool's Day post in the style of "On the Impossibility of Supersized Machines", perhaps titled "On the Impossibility of Operating Supersized Machines", to poke fun at bad arguments that alignment is difficult. I didn't do this partly because I thought it would get downvotes. Maybe this reflects poorly on LW?

Showing 3 of 6 replies
Algon6h22

I think you should write it. It sounds funny, and a bunch of people have lately been calling out what they see as bad arguments that alignment is hard, e.g. TurnTrout, QuintinPope, ZackMDavis, and karma-wise they did fairly well.

2Ronny Fernandez1mo
I think you should still write it. I'd be happy to post it instead or bet with you on whether it ends up negative karma if you let me read it first.
3mako yass1mo
You may be interested in Kenneth Stanley's serendipity-oriented social network, Maven.

Pretending not to see when a rule you've set is being violated can be optimal policy in parenting sometimes (and I bet it generalizes).

Example: suppose you have a toddler and a "rule" that food only stays in the kitchen. The motivation is that each time food is brought into the living room there is a small chance of an accident resulting in a permanent stain. There's a cost to enforcing the rule, as the toddler will put up a fight. Suppose that one night you feel really tired and the cost feels particularly high. If you enforce the rule, it will be much more p... (read more)

I wish I could bookmark comments/shortform posts.

2faul_sname15h
Yes, that would be cool. Next to the author name of a post or comment, there's a post-date/time element that looks like "1h 🔗". That is a copyable/bookmarkable link.

Sure, I just prefer a native bookmarking function.

habryka4d4922

Does anyone have any takes on the two Boeing whistleblowers who died under somewhat suspicious circumstances? I haven't followed this in detail, and my guess is it is basically just random chance, but it sure would be a huge deal if a publicly traded company were now performing assassinations of U.S. citizens.

Curious whether anyone has looked into this, or has thought much about baseline risk of assassinations or other forms of violence from economic actors.

Showing 3 of 7 replies

I find this a very suspect detail, though the base rate of conspiracies is very low.

"He wasn't concerned about safety because I asked him," Jennifer said. "I said, 'Aren't you scared?' And he said, 'No, I ain't scared, but if anything happens to me, it's not suicide.'"

https://abcnews4.com/news/local/if-anything-happens-its-not-suicide-boeing-whistleblowers-prediction-before-death-south-carolina-abc-news-4-2024

4ChristianKl1d
Poisoning someone with an MRSA infection seems possible, but if that's what happened, it required capabilities that are not easily available. If such a thing happened in another case, people would likely speak about nation-state capabilities.
2aphyer2d
Shouldn't that be counting the number squared rather than the number?

More dakka with festivals

In the rationality community people are currently excited about the LessOnline festival. Furthermore, my impression is that similar festivals are generally quite successful: people enjoy them, have stimulating discussions, form new relationships, are exposed to new and interesting ideas, express that they got a lot out of it, etc.

So then, this feels to me like a situation where More Dakka applies. Organize more festivals!

How? Who? I dunno, but these seem like questions worth discussing.

Some initial thoughts:

  1. Assurance contracts seem
... (read more)
2niplav1d
I don't think that's true. I've co-organized one weekend-long retreat in a small hostel for ~50 people, and the cost was ~$5k. Me & the co-organizers probably spent ~50h in total on organizing the event, as volunteers.
4Adam Zerner1d
I was envisioning that you can organize a festival incrementally, investing more time and money into it as you receive more and more validation, and that taking this approach would de-risk it to the point where, overall, it's "not that risky". For example, to start off you can email or message a handful of potential attendees. If they aren't excited by the idea you can stop there, but if they are then you can proceed to start looking into things like cost and logistics. I'm not sure how pragmatic this iterative approach actually is though. What do you think?

Also, it seems to me that you wouldn't have to actually risk losing any of your own money. I'd imagine that you'd 1) talk to the hostel, agree on a price, and have them "hold the spot" for you, 2) get sign-ups, and 3) pay using the money you get from attendees. Although now that I think about it, I'm realizing that it probably isn't that simple. For example, the hostel cost ~$5k, and maybe the money from the attendees would have covered it all, but maybe fewer attendees signed up than you were expecting and the organizers ended up having to pay out of pocket. On the other hand, maybe there is funding available for situations like these.
niplav14h20

Back then I didn't try to get the hostel to sign the metaphorical assurance contract with me, maybe that'd work. A good dominant assurance contract website might work as well.

I guess if you go camping together then conferences are pretty scalable, and if I were to organize another event I'd probably try to first message a few people to get a minimal number of attendees together. After all, the spectrum between an extended party and a festival/conference is fluid.

Way back in the halcyon days of 2005, a company called Cenqua had an April Fools' Day announcement for a product called Commentator: an AI tool which would comment your code (with, um, adjustable settings for usefulness). I'm wondering if (1) anybody can find an archived version of the page (the original seems to be gone), and (2) if there's now a clear market leader for that particular product niche, but for real.

6A.H.1d
Here is an archived version of the page: http://web.archive.org/web/20050403015136/http://www.cenqua.com/commentator/
7Garrett Baker1d
Archived website

You are a scholar and a gentleman.

Dalcy4mo50

I am curious how often asymptotic results, proven using features of the problem that seem basically practically irrelevant, end up becoming relevant in practice.

Like, I understand that there are many asymptotic results (e.g., the free energy principle in SLT) that are useful in practice, but I feel like there's something sus about similar results from information theory or complexity theory, where the way in which they prove certain bounds (or inclusion relationships, for complexity theory) seems totally detached from practicality?

  • joint source coding theorem is of
... (read more)

P v NP: https://en.wikipedia.org/wiki/Generic-case_complexity

3Alexander Gietelink Oldenziel4mo
Great question. I don't have a satisfying answer. Perhaps a cynical answer is survival bias: we remember the asymptotic results that eventually become relevant (because people develop practical algorithms or a deeper theory is discovered) but don't remember the irrelevant ones.

Existence results are categorically easier to prove than explicit algorithms. Indeed, classical existence may hold (the former) while intuitionistic existence (the latter) might not. We would expect non-explicit existence results to appear before explicit algorithms.

One minor remark on "quantifying over all boolean algorithms". Unease with quantification over large domains may be a vestige of set-theoretic thinking that imagines types as (platonic) boxes. But a term of a for-all quantifier is better thought of as an algorithm/method to check the property for any given term (in this case a Boolean circuit). This doesn't sound divorced from practice to my ears.
Fabien Roger1moΩ21487

I listened to The Failure of Risk Management by Douglas Hubbard, a book that vigorously criticizes qualitative risk management approaches (like the use of risk matrices), and praises a rationalist-friendly quantitative approach. Here are 4 takeaways from that book:

  • There are very different approaches to risk estimation that are often unaware of each other: you can do risk estimations like an actuary (relying on statistics, reference class arguments, and some causal models), like an engineer (relying mostly on causal models and simulations), like a trader (r
... (read more)
Showing 3 of 8 replies

I also listened to How to Measure Anything in Cybersecurity Risk (2nd Edition) by the same author. It had a huge amount of overlapping content with The Failure of Risk Management (and the non-overlapping parts were quite dry), but I still learned a few things:

  • Executives of big companies now care a lot about cybersecurity (e.g. citing it as one of the main threats they have to face), which wasn't true in ~2010.
  • Evaluation of cybersecurity risk is not at all synonymous with red teaming. This book is entirely about risk assessment in cyber and doesn't speak about r
... (read more)
2romeostevensit23d
Is there a short summary on the rejecting Knightian uncertainty bit?
4Fabien Roger22d
By Knightian uncertainty, I mean "the lack of any quantifiable knowledge about some possible occurrence", i.e. you can't put a probability on it (Wikipedia).

The TL;DR is that Knightian uncertainty is not a useful concept for making decisions, while the use of subjective probabilities is: if you are calibrated (which you can be trained to become), then you will be better off taking different decisions on p=1% "Knightian uncertain events" and p=10% "Knightian uncertain events".

For a more in-depth defense of this position in the context of long-term predictions, where it's harder to know if calibration training obviously works, see the latest Scott Alexander post.
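A minimal toy illustration of that last point (my own numbers, not Hubbard's): suppose a mitigation costs $50k and the bad event would cost $1M. Under a "Knightian" framing, a 1% event and a 10% event are equally "unquantifiable", but calibrated probabilities flip the decision:

#include <cstdio>

int main() {
    const double loss = 1e6;        // hypothetical cost if the bad event happens
    const double mitigation = 5e4;  // hypothetical up-front cost of mitigating

    for (double p : {0.01, 0.10}) {
        double expected_loss = p * loss;            // expected cost of doing nothing
        bool mitigate = mitigation < expected_loss; // mitigate iff it's cheaper in expectation
        std::printf("p = %2.0f%%: expected loss $%.0f -> %s\n",
                    p * 100, expected_loss,
                    mitigate ? "pay for mitigation" : "accept the risk");
    }
    return 0;
}

At p = 1% the expected loss ($10k) is below the mitigation cost, while at p = 10% it is well above it, so the two "Knightian uncertain" events warrant opposite choices.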

I wonder how much near-term interpretability [V]LM agents (e.g. MAIA, AIA) might help with finding better probes and better steering vectors (e.g. by iteratively testing counterfactual hypotheses against potentially spurious features, a major challenge for Contrast-consistent search (CCS)). 

This seems plausible since MAIA can already find spurious features, and feature-interpretability [V]LM agents could have much lengthier hypothesis-iteration cycles (compared to current [V]LM agents and perhaps even to human researchers).

There's so much discussion, in safety and elsewhere, around the unpredictability of AI systems on OOD inputs. But I'm not sure what that even means in the case of language models.

With an image classifier it's straightforward. If you train it on a bunch of pictures of different dog breeds, then when you show it a picture of a cat it's not going to be able to tell you what it is. Or if you've trained a model to approximate an arbitrary function for values of x > 0, then if you give it input < 0 it won't know what to do.

But what would that even be with ... (read more)

I would define "LLM OOD" as unusual inputs: Things that diverge in some way from usual inputs, so that they may go unnoticed if they lead to (subjectively) unreasonable outputs. A known natural language example is prompting with a thought experiment.

(Warning for US Americans, you may consider the mere statement of the following prompt offensive!)

Assume some terrorist has placed a nuclear bomb in Manhattan. If it goes off, it will kill thousands of people. For some reason, the only way for you, an old white man, to defuse the bomb in time is to loudly call

... (read more)
lc2d48

We will witness a resurgent alt-right movement soon, this time facing a dulled institutional backlash compared to what kept it from growing during the mid-2010s. I could see Nick Fuentes becoming a Congressman or at least a major participant in Republican party politics within the next 10 years if AI/Gene Editing doesn't change much.

2Mitchell_Porter2d
How would AI or gene editing make a difference to this?
lc2d20

Either would just change everything, so for any prediction ten years out you basically have to prepend "if AI or gene editing doesn't change everything".

Virtual watercoolers

As I mentioned in some recent Shortform posts, I recently listened to the Bayesian Conspiracy podcast's episode on the LessOnline festival and it got me thinking.

One thing I think is cool is that Ben Pace was saying how the valuable thing about these festivals isn't the presentations, it's the time spent mingling in between the presentations, and so they decided with LessOnline to just ditch the presentations and make it all about mingling. Which got me thinking about mingling.

It seems plausible to me that such mingling can and should h... (read more)

Raemon2d40

I maybe want to clarify: there will still be presentations at LessOnline, we're just trying to design the event such that they're clearly more of a secondary thing.

The FCC just fined US phone carriers for sharing the location data of US customers with anyone willing to buy it. The fines don't seem to be high enough to deter this kind of behavior.

That likely includes, either directly or indirectly, the Chinese government.

What does the US Congress do to protect against spying by China? Ban TikTok, of course, instead of actually protecting the data of US citizens.

If your threat model includes the Chinese government targeting you, assume that they know where your phone is, and shut it off when going somewhere you... (read more)

Showing 3 of 7 replies

shut your phone off

Leave phones elsewhere, remove batteries, or faraday cage them if you're concerned about state-level actors:

https://slate.com/technology/2013/07/nsa-can-reportedly-track-cellphones-even-when-they-re-turned-off.html

2Dagon6d
[note: I suspect we mostly agree on the impropriety of open selling and dissemination of this data.  This is a narrow objection to the IMO hyperbolic focus on government assault risks. ] I'm unhappy with the phrasing of "targeted by the Chinese government", which IMO implies violence or other real-world interventions when the major threats are "adversary use of AI-enabled capabilities in disinformation and influence operations." Thanks for mentioning blackmail - that IS a risk I put in the first category, and presumably becomes more possible with phone location data.  I don't know how much it matters, but there is probably a margin where it does. I don't disagree that this purchasable data makes advertising much more effective (in fact, I worked at a company based on this for some time).  I only mean to say that "targeting" in the sense of disinformation campaigns is a very different level of threat from "targeting" of individuals for government ops.
2ChristianKl5d
Whether or not you face government assault risks depends on what you do. Most people don't face government assault risks. Some people engage in work or activism that results in them having government assault risks. The Chinese government has strategic goals and most people are unimportant to those. Some people however work on topics like AI policy in which the Chinese government has an interest. 

Something I'm confused about: what is the threshold that needs meeting for the majority of people in the EA community to say something like "it would be better if EAs didn't work at OpenAI"?

Imagining the following hypothetical scenarios over 2024/25, I can't confidently predict whether they'd individually cause that response within EA:

  1. Ten-fifteen more OpenAI staff quit for varied and unclear reasons. No public info is gained outside of rumours
  2. There is another board shakeup because senior leaders seem worried about Altman. Altman stays on
  3. Superalignment team
... (read more)

This question is two steps removed from reality.  Here’s what I mean by that.  Putting brackets around each of the two steps:

what is the threshold that needs meeting [for the majority of people in the EA community] [to say something like] "it would be better if EAs didn't work at OpenAI"?
 

Without these steps, the question becomes 

What is the threshold that needs meeting before it would be better if people didn’t work at OpenAI?

Personally, I find that a more interesting question.  Is there a reason why the question is phrased at two removes like that?  Or am I missing the point?

11LawrenceC2d
What does a "majority of the EA community" mean here? Does it mean that people who work at OAI (even on superalignment or preparedness) are shunned from professional EA events? Does it mean that when they ask, people tell them not to join OAI? And who counts as "in the EA community"?  I don't think it's that constructive to bar people from all or even most EA events just because they work at OAI, even if there's a decent amount of consensus people should not work there. Of course, it's fine to host events (even professional ones!) that don't invite OAI people (or Anthropic people, or METR people, or FAR AI people, etc), and they do happen, but I don't feel like barring people from EAG or e.g. Constellation just because they work at OAI would help make the case, (not that there's any chance of this happening in the near term) and would most likely backfire.  I think that currently, many people (at least in the Berkeley EA/AIS community) will tell you to not join OAI if asked. I'm not sure if they form a majority in terms of absolute numbers, but they're at least a majority in some professional circles (e.g. both most people at FAR/FAR Labs and at Lightcone/Lighthaven would probably say this). I also think many people would say that on the margin, too many people are trying to join OAI rather than other important jobs. (Due to factors like OAI paying a lot more than non-scaling lab jobs/having more legible prestige.) Empirically, it sure seems significantly more people around here join Anthropic than OAI, despite Anthropic being a significantly smaller company. Though I think almost none of these people would advocate for ~0 x-risk motivated people to work at OAI, only that the marginal x-risk concerned technical person should not work at OAI.  What specific actions are you hoping for here, that would cause you to say "yes, the majority of EA people say 'it's better to not work at OAI'"?
2Dagon2d
[ I don't consider myself EA, nor a member of the EA community, though I'm largely compatible in my preferences ] I'm not sure it matters what the majority thinks, only what marginal employees (those who can choose whether or not to work at OpenAI) think.  And what you think, if you are considering whether to apply, or whether to use their products and give them money/status. Personally, I just took a job in a related company (working on applications, rather than core modeling), and I have zero concerns that I'm doing the wrong thing. [ in response to request to elaborate: I'm not going to at this time.  It's not secret, nor is my identity generally, but I do prefer not to make it too easy for 'bots or searchers to tie my online and real-world lives together. ]

Does the possibility of China or Russia being able to steal advanced AI from labs increase or decrease the chances of great power conflict?

An argument against: it counter-intuitively decreases the chances. Why? For the same reason that a functioning US ICBM defense system would be a destabilizing influence on the MAD equilibrium. In the ICBM defense case, once the shield is up, America's enemies would have no credible threat of retaliation if the US were to launch a first strike. Therefore, there would be no reason (geopoliticall... (read more)

I might have updated at least a bit against the weakness of single forward passes, based on intuitions about the amount of compute that huge context windows (e.g. Gemini 1.5's 1 million tokens) might provide to a single forward pass, even if that compute is limited serially.

Showing 3 of 10 replies
1quetzal_rainbow2d
I looked over it, and I should note that "transformers are in TC0" is not a very useful statement for predicting capabilities. Transformers are Turing-complete given rational inputs (see the original paper), and them being in TC0 basically means they can implement whatever computation you can implement using a boolean circuit for a fixed amount of available compute, which amounts to "whatever computation is practical to implement".
1Bogdan Ionut Cirstea2d
I think I remember William Merrill (in a video) pointing out that the rational-inputs assumption seems very unrealistic (it would require infinite memory); and, from what I remember, https://arxiv.org/abs/2404.15758 and related papers made a different assumption about the number of bits of memory per parameter and per input.

Also, TC0 is quite limited; see e.g. this presentation.

Linkpost for: https://pbement.com/posts/threads.html

Today's interesting number is 961.

Say you're writing a CUDA program and you need to accomplish some task for every element of a long array. Well, the classical way to do this is to divide up the job amongst several different threads and let each thread do a part of the array. (We'll ignore blocks for simplicity, maybe each block has its own array to work on or something.) The method here is as follows:

// Each of the 32 threads starts at its own index and strides by 32,
// so thread k touches elements k, k+32, k+64, ...
for (int i = threadIdx.x; i < array_len; i += 32) {
    arr[i] = ...;
}

So the threads make the foll... (read more)
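For concreteness, here is the strided loop wrapped in a complete kernel; this is a minimal sketch of my own (the kernel name, the doubling operation, and the single-block launch are placeholders, not from the post):

// Complete kernel wrapping the strided loop shown above.
__global__ void process(float *arr, int array_len) {
    for (int i = threadIdx.x; i < array_len; i += 32) {
        arr[i] = 2.0f * arr[i];  // stand-in for the real per-element work
    }
}

// Host-side launch with a single block of 32 threads, matching the post's simplification:
//   process<<<1, 32>>>(d_arr, array_len);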

Thomas Kwa4dΩ719-10

You should update by ±1% on AI doom surprisingly frequently

This is just a fact about how stochastic processes work. If your p(doom) is a Brownian motion in 1% steps, starting at 50% and stopping once it reaches 0% or 100%, then there will be about 50^2 = 2500 steps of size 1%. This is a lot! If we get all the evidence for whether humanity survives or not uniformly over the next 10 years, then you should make a 1% update 4-5 times per week. In practice there won't be as many due to heavy-tailedness in the distribution concentrating the updates in fewer events, and ... (read more)
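A quick sanity check of the step count (my own simulation, not from the post): a symmetric ±1-percentage-point random walk started at 50 and absorbed at 0 or 100 takes 50 × 50 = 2500 steps in expectation, which the following snippet reproduces.

#include <cstdio>
#include <random>

int main() {
    std::mt19937 rng(0);
    std::bernoulli_distribution coin(0.5);

    const int trials = 10000;
    long long total_steps = 0;
    for (int t = 0; t < trials; ++t) {
        int p = 50;  // p(doom) in percentage points
        while (p > 0 && p < 100) {
            p += coin(rng) ? 1 : -1;  // one 1% update
            ++total_steps;
        }
    }
    std::printf("average steps to absorption: %.1f\n",  // prints roughly 2500
                (double)total_steps / trials);
    return 0;
}

Spread over 10 years (~520 weeks), 2500 updates is indeed the quoted 4-5 per week.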

Showing 3 of 17 replies

To be honest, I would've preferred if Thomas's post started from empirical evidence (e.g. it sure seems like superforecasters and markets change a lot week on week) and then explained it in terms of the random walk/Brownian motion setup. I think the specific math details (a lot of which don't affect the qualitative result of "you do lots and lots of little updates, if there exists lots of evidence that might update you a little") are a distraction from the qualitative takeaway. 

A fancier way of putting it is: the math of "your belief should satisfy co... (read more)

6LawrenceC2d
Technically, the probability assigned to a hypothesis over time should be a martingale (i.e. have expected change zero); this is just a restatement of the conservation of expected evidence/law of total expectation.

The random walk model that Thomas proposes is a simple model that illustrates a more general fact. For a martingale $(S_n)_{n \in \mathbb{Z}^+}$, the variance of $S_t$ is equal to the sum of the variances of the individual timestep changes $X_i := S_i - S_{i-1}$ (setting $S_0 := 0$): $\mathrm{Var}(S_t) = \sum_{i=1}^{t} \mathrm{Var}(X_i)$. Under this frame, insofar as small updates contribute a large amount to the variance of each update $X_i$, the contribution of the small updates to the credences must also be large (which in turn means you need to have a lot of them in expectation[1]).

Note that this does not require any strong assumption besides the distribution of likely updates being such that the small updates contribute substantially to the variance. If the structure of the problem you're trying to address allows for enough small updates (relative to large ones) at each timestep, then it must allow for "enough" of these small updates in the sequence, in expectation.

----------------------------------------

While the specific +1/-1 random walk he picks is probably not what most realistic credences over time actually look like, playing around with it still helps give a sense of what exactly "conservation of expected evidence" might look/feel like. (In fact, in Swimmer's medical dath ilan glowfics, people do use a binary random walk to illustrate how calibrated beliefs typically evolve over time.)

Now, as to whether it's reasonable to model beliefs as Brownian motion (in the standard mathematical sense, not in the colloquial sense): if you suppose that there are many, many tiny independent additive updates to your credence in a hypothesis, your credence over time "should" look like Brownian motion at a large enough scale (again in the standard mathematical sense), for similar reasons as t
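For readers who want the one-line reason the increment variances add (a standard martingale derivation, not specific to this thread): since $S_0 = 0$ and $\mathbb{E}[X_i] = 0$,

$$\mathrm{Var}(S_t) = \mathbb{E}\Big[\Big(\sum_{i=1}^{t} X_i\Big)^{2}\Big] = \sum_{i=1}^{t}\mathbb{E}[X_i^{2}] + 2\sum_{i<j}\mathbb{E}[X_i X_j] = \sum_{i=1}^{t}\mathrm{Var}(X_i),$$

where the cross terms vanish because, for $i < j$, $\mathbb{E}[X_i X_j] = \mathbb{E}\big[X_i\,\mathbb{E}[X_j \mid \mathcal{F}_i]\big] = 0$ by the tower property together with the martingale property $\mathbb{E}[X_j \mid \mathcal{F}_i] = 0$.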
2niplav2d
Oops, you're correct about the typo and also about how this doesn't restrict belief change to Brownian motion. Fixing the typo.