LESSWRONG

Nathan Helm-Burger

AI alignment researcher, ML engineer. Master's in Neuroscience.

I believe that cheap and broadly competent AGI is attainable and will be built soon. This leads me to have timelines of around 2024-2027. Here's an interview I gave recently about my current research agenda. I think the best path forward to alignment is through safe, contained testing on models designed from the ground up for alignability trained on censored data (simulations with no mention of humans or computer technology). I think that current ML mainstream technology is close to a threshold of competence beyond which it will be capable of recursive self-improvement, and I think that this automated process will mine neuroscience for insights, and quickly become far more effective and efficient. I think it would be quite bad for humanity if this happened in an uncontrolled, uncensored, un-sandboxed situation. So I am trying to warn the world about this possibility. 

See my prediction markets here:

 https://manifold.markets/NathanHelmBurger/will-gpt5-be-capable-of-recursive-s?r=TmF0aGFuSGVsbUJ1cmdlcg 

I also think that current AI models pose misuse risks, which may continue to get worse as models get more capable, and that this could potentially result in catastrophic suffering if we fail to regulate this.

I now work for SecureBio on AI-Evals.

relevant quotes: 

"There is a powerful effect to making a goal into someone’s full-time job: it becomes their identity. Safety engineering became its own subdiscipline, and these engineers saw it as their professional duty to reduce injury rates. They bristled at the suggestion that accidents were largely unavoidable, coming to suspect the opposite: that almost all accidents were avoidable, given the right tools, environment, and training." https://www.lesswrong.com/posts/DQKgYhEYP86PLW7tZ/how-factories-were-made-safe 

 

"The prospect for the human race is sombre beyond all precedent. Mankind are faced with a clear-cut alternative: either we shall all perish, or we shall have to acquire some slight degree of common sense. A great deal of new political thinking will be necessary if utter disaster is to be averted." - Bertrand Russell, The Bomb and Civilization 1945.08.18

 

"For progress, there is no cure. Any attempt to find automatically safe channels for the present explosive variety of progress must lead to frustration. The only safety possible is relative, and it lies in an intelligent exercise of day-to-day judgment." - John von Neumann

 

"I believe that the creation of greater than human intelligence will occur during the next thirty years. (Charles Platt has pointed out that AI enthusiasts have been making claims like this for the last thirty years. Just so I'm not guilty of a relative-time ambiguity, let me be more specific: I'll be surprised if this event occurs before 2005 or after 2030.)" - Vernor Vinge, Singularity

Posts



Comments

Nathan Helm-Burger's Shortform (4 karma, 3y, 130 comments)
Nathan Helm-Burger's Shortform
Nathan Helm-Burger, 12d

I agree with this post that claims that research taste and compute budgets are fairly fungible: https://x.com/francoisfleuret/status/1958211714601607441

How Does A Blind Model See The Earth?
Nathan Helm-Burger, 19d

There's also vast.ai and Lambda Labs. And Prime Intellect.

LLMs Can't See Pixels or Characters
Nathan Helm-Burger, 1mo

Update: Open Source paper on this just came out: https://x.com/_sunil_kumar/status/1952906246182584632

Shallow Water is Dangerous Too
Nathan Helm-Burger, 1mo

I was taught to 'swim' and was comfortable with holding my breath, getting to the side of a pool, and back-floating, all before the age of 1.

One of the closest times I came to drowning was at age 4, when I was playing with a half-full bucket of water and fell in. I held my breath and wiggled my legs to tip the bucket over, and was fine. But had I not held my breath and instead inhaled water, I probably would not have been fine.

Spilling the Tea
Nathan Helm-Burger, 1mo

Not super helpful to me, since I am (not coincidentally) in a stable long-term relationship, but I'm confident that all my exes would give me green flags. That would be... very few green flags though. So... yeah, you might worry about a lot of green flags if what you wanted was something long term.

Teaching kids to swim
Nathan Helm-Burger, 1mo

Passively floating on your back is hard for skinny folk! Much easier to backfloat while moving, and it's still a much lower energy activity than treading water upright. I wouldn't recommend a kid trying to practice backfloating while holding still unless they're naturally buoyant. Instead, the question is 'how little energy can you expend while staying up', and 'tracking your position by looking at the ceiling so you don't bump your head while doing backstroke laps'. The faster you go, the easier it is to stay up, but the more energy you expend. There's a comfortable medium that'll be different for each person, and change as their body changes.

Teaching kids to swim
Nathan Helm-Burger, 1mo

Yeah, I was comfortable swimming before I could walk. Not like, make good progress per se, but like, if I fell in water with no flotation assistance, I could comfortably hold my breath, orient, get to the surface, and float on my back without assistance.

Teaching kids to swim
Nathan Helm-Burger, 1mo

My parents, especially my dad, love water and swimming. So I was taught to "swim" before I could crawl, mostly just learning to hold my breath as I was bobbed around in the water, and then to manage to keep my head mostly above water (when intended) and navigate (slowly) around with arm floaties. My younger brother didn't get taught until later, like maybe 1 and a half. He also loves the water, but not as much as me. I feel so at home in the water and under water. So personally, I'd recommend starting young and not having any particular goals in mind for the first few years other than "hang out in pool with kid for 20-30 min, and don't let them drown". Actual swimming strokes can come much later.

LLMs Can't See Pixels or Characters
Nathan Helm-Burger, 1mo

Personally, I think there's a simple fix to this for agentic systems, and it has already been implemented for o3. Zoom and pan. If the model can zoom in all the way to the pixel level, and then beyond, so that the pixels get duplicated in all directions until a single repeated "pixel" fills an entire visual token patch... then the perceptual problem is gone. Now it's back to a general intelligence problem. I'm not saying that's an optimal solution, just a simple one.
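
To make the zoom idea concrete, here is a rough sketch of zooming by pixel duplication (nearest-neighbor upscaling). It is only an illustration, not how o3 or any real system does it. The function name `zoom_nearest`, the nested-list image format, and the 14-pixel patch size are all assumptions made for the example.

```python
PATCH = 14  # assumed side length of one visual token patch (hypothetical)


def zoom_nearest(image, top, left, height, width, factor):
    """Crop a region of `image` and enlarge it by integer pixel duplication.

    `image` is a list of rows, each row a list of pixel values.
    """
    crop = [row[left:left + width] for row in image[top:top + height]]
    zoomed = []
    for row in crop:
        # Repeat each pixel `factor` times horizontally.
        wide_row = [pixel for pixel in row for _ in range(factor)]
        # Repeat the widened row `factor` times vertically.
        zoomed.extend(list(wide_row) for _ in range(factor))
    return zoomed


# A tiny 2x2 "image"; after zooming by PATCH, each source pixel
# covers one full PATCH x PATCH block.
image = [[0, 1], [2, 3]]
zoomed = zoom_nearest(image, top=0, left=0, height=2, width=2, factor=PATCH)
```

Each source pixel lines up with exactly one patch only when the zoom factor is a multiple of the patch size and the crop is aligned to the patch grid. A real viewer would have to handle that alignment itself.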

A night-watchman ASI as a first step toward a great future
Nathan Helm-Burger, 1mo

Yes, work is being done by some to explore the idea of decentralized peer-to-peer consensual inspections. For things like biolabs that want to reassure each other that none of their student volunteers is up to bad stuff.

Unfaithful Reasoning Can Fool Chain-of-Thought Monitoring (76 karma, 3mo, 17 comments)
Proactive 'If-Then' Safety Cases (10 karma, 9mo, 0 comments)
A path to human autonomy (53 karma, 10mo, 16 comments)
My hopes for YouCongress.com (14 karma, 1y, 3 comments)
Physics of Language models (part 2.1) (9 karma, 1y, 2 comments)
Avoiding the Bog of Moral Hazard for AI (19 karma, 1y, 13 comments)
A bet for Samo Burja (14 karma, 1y, 2 comments)
Diffusion Guided NLP: better steering, mostly a good thing (13 karma, 1y, 0 comments)
Imbue (Generally Intelligent) continue to make progress (18 karma, 1y, 0 comments)
Secret US natsec project with intel revealed (27 karma, 1y, 1 comment)