Sheikh Abdur Raheem Ali

Software Engineer at Microsoft who may focus on the alignment problem for the rest of his life (please bet on the prediction market here).

Markets say I'd earn more elsewhere, but the AGI notkilleveryoneism community has been vocally critical of MS.

What can I do that 60k developers can't? Translate ideas into silos that I have control over and help overcome chaotic internal communication barriers.



There's an effect that works in the opposite direction, where you lower the hiring bar as headcount scales. Key early hires may have a more stringent filter applied to them than later additions. But the bar can still be arbitrarily high: look at the profiles of people who have joined recently, e.g. Leaving Wave, joining Anthropic.

It's important to be clear about what the goal is: if it's the instrumental careerist goal "increase status to maximize the probability of joining a prestigious organization", then that strategy may look very different from the terminal scientist goal of "reduce x-risk by doing technical AGI alignment work". The former seems much more competitive than the latter.

The following part will sound a little self-helpy, but hopefully it'll be useful:

Concrete suggestion: this weekend, execute on some small tasks which satisfy the following constraints:

  • can't be sold as being important or high impact.
  • won't make it into the top 10 list of most impressive things you've ever done.
  • not necessarily aligned with your personal brand.
  • has relatively low value from an optics perspective.
  • high confidence of trivially low implementation complexity.
  • can be abandoned at zero reputational/relationship cost.
  • isn't connected to a broader roadmap and high-level strategy.
  • requires minimal learning/overcoming insignificant levels of friction.
  • doesn't feel intimidating or serious or psychologically uncomfortable.

Find the tasks in your notes after a period of physical exertion. Avoid searching the internet or digging deeply into your mind (anything you can characterize as paying constant attention to filtered noise to mitigate the risk that some decision-relevant information managed to slip past your cognitive systems). Decline anything that spurs an instinct of anxious perfectionism. Understand where you are first and marginally shift towards your desired position.

You sound like someone who has a far larger max step size than ordinary people. You have the ability to get to places by making one big leap. But go to the simulation in Why Momentum Really Works and fix momentum at 0.99. What happens to the solution as you gradually move the step size slider to the right?

Chaotic divergence and oscillation. 
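A minimal sketch of that experiment, assuming a one-dimensional quadratic objective instead of the objective used in the linked simulation. The function names, the curvature of 1.0, and the chosen step sizes are all illustrative assumptions. On this toy problem, heavy-ball momentum is only stable while step_size * curvature < 2 + 2 * momentum, so with momentum at 0.99 the iterates overshoot and flip sign more and more as the step size grows, then blow up once it passes 3.98.

```python
def momentum_descent(step_size, momentum=0.99, curvature=1.0, steps=500, w0=1.0):
    """Heavy-ball gradient descent on f(w) = 0.5 * curvature * w**2.

    Returns the full trajectory of w, starting from w0.
    """
    w, velocity = w0, 0.0
    trajectory = [w]
    for _ in range(steps):
        grad = curvature * w                    # f'(w) for the quadratic
        velocity = momentum * velocity + grad   # accumulate past gradients
        w = w - step_size * velocity            # take the step
        trajectory.append(w)
    return trajectory


def sign_flips(trajectory):
    """Count how often consecutive iterates land on opposite sides of the minimum."""
    return sum(1 for a, b in zip(trajectory, trajectory[1:]) if a * b < 0)


if __name__ == "__main__":
    # Step sizes are arbitrary points on either side of the stability limit.
    for step_size in (0.01, 0.1, 1.0, 3.9, 4.1):
        traj = momentum_descent(step_size)
        print(
            f"step_size={step_size:<5} "
            f"final |w|={abs(traj[-1]):.3e} "
            f"sign flips={sign_flips(traj)}"
        )
```

Small steps crawl towards the minimum with a few slow oscillations, steps just under the limit bounce across it on almost every iteration, and steps past the limit diverge.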

Selling your startup to get into Anthropic seems, with all due respect, to be a plan with step count = 1. Recall Expecting Short Inferential Distances. Practicing adaptive dampening would let you more reliably plan and follow routes requiring step count > 1. To be fair, I can kinda see where you're coming from, and logically it can be broken down into independent subcomponents that you work on in parallel, but the best advice I can concisely offer without more context on the details of your situation would be this: 

"Learn to walk".

Look into AMD MI300X. Has 192 GB HBM3 memory. With FP4 weights, might run GPT-4 in a single node of 8 GPUs and still have plenty to spare for KV. Eliminating cross-node communication easily allows 2x batch size.
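The memory arithmetic above can be sketched as follows. GPT-4's parameter count is not public, so `ASSUMED_PARAMS` is a purely hypothetical placeholder (an unconfirmed rumored figure), and the helper name is illustrative. The sketch only counts weight storage and ignores activations, framework overhead, and fragmentation.

```python
GB = 10**9  # decimal gigabytes, matching how HBM capacity is usually quoted


def node_memory_budget(param_count, bits_per_weight=4, gpus_per_node=8, hbm_gb_per_gpu=192):
    """Split a node's total HBM into weight storage and what is left over."""
    total_bytes = gpus_per_node * hbm_gb_per_gpu * GB
    weight_bytes = param_count * bits_per_weight / 8  # FP4 = half a byte per weight
    return {
        "total_gb": total_bytes / GB,
        "weights_gb": weight_bytes / GB,
        "free_gb": (total_bytes - weight_bytes) / GB,  # headroom, e.g. for KV cache
    }


# ASSUMPTION: hypothetical parameter count, not a confirmed figure.
ASSUMED_PARAMS = 1.8e12

budget = node_memory_budget(ASSUMED_PARAMS)
for name, value in budget.items():
    print(f"{name}: {value:.0f} GB")
```

Swapping in a different parameter count or weight precision shows how quickly the KV headroom shrinks or grows.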

Fungibility is a good idea, would take avg. KVUtil from 10% to 30% imo.

Thanks for this! I've been unsatisfied with my long form writing for some time and was going to make a pre-publication checklist for future posts, and customizing this for my personal use helps me save time on that.

> I scored the answers using GPT-4.


GPT-4 scores under 60% on TruthfulQA according to page 11 of the tech report. How reliable are these scores?


Also, what do you think about this paper? Inference-Time Intervention: Eliciting Truthful Answers from a Language Model.

Have you read "Hidden Incentives for Auto-Induced Distributional Shift" (arXiv:2009.09153)? It's cited in Jan Leike's "Why I’m optimistic about our alignment approach":

> For example, when using a reward model trained from human feedback, we need to update it quickly enough on the new distribution. In particular, auto-induced distributional shift might change the distribution faster than the reward model is being updated.

I used to be less worried about this, but the success of parameter-efficient finetuning with e.g. LoRAs changed my mind: it convinced me that you could have models with short feedback loops between their outputs and inputs (as opposed to the current regime of large training runs, which are not economical to do often). I believe that training on AI-generated text is a potential pathway to eventual doom, but I haven't yet modelled this concretely enough to be confident about whether it is the first thing that kills us or if some other effect gets there earlier.

The early influences that led me to think this are mostly related to dynamical mean-field theory, but I haven't had time to develop this into a full argument.

I can vouch that I have had the same experience (but am not allowed to share outputs of the larger model I have in mind). First encountered via curation without intentional steering in that direction, but I would be surprised if this failed to replicate with an experimental setup that selects completions randomly without human input. Let me know if you have such a setup in mind that you feel is sufficiently rigorous to act as a crux.

> (It’s fine if the AI has access to a cached copy of the internet while in “boxed” mode, like the Bing chatbot does.)


I don't believe this is true. 

> We have learned that the ChatGPT Browse beta can occasionally display content in ways we don't want. For example, if a user specifically asks for a URL's full text, it might inadvertently fulfill this request.

Source: How do I use ChatGPT Browse with Bing to search the web? | OpenAI Help Center

Update: now that Vision Pro is out, would you consider that to meet your definition of "Transformative VR"?

> But, of course, these two challenges were completely toy. Future challenges and benchmarks should not be.


I am confused. I imagine that there would still be uses for toy problems in future challenges and benchmarks. Of course, we don’t want to have exclusively toy problems, but I am reading this as advocating for the other extreme without providing adequate support for why, though I may have misunderstood. My defense of toy problems is that they are more broadly accessible, require less investment to iterate on, and allow us to isolate one specific portion of the difficulty, enabling progress to be made in one step, instead of needing to decompose and solve multiple subproblems. We can always discard those toy solutions that do not scale to larger models.

In particular, toy problems are especially suitable as a playground for novel approaches that are not yet mature. These usually are not initially performant enough to justify allocating substantial resources towards but may hold promise eventually once the kinks are ironed out. With a robust set of standard toy problems, we can determine which of these new procedures may be worth further investigation and refinement. This is especially important in a pre-paradigmatic field like mechanistic interpretability, where we may (as an analogy) be in a geocentric era waiting for heliocentrism to be invented.
