Thao Amelia Pham
Working on multi-agent safety, hoping for the least-bad future. Other interests include s-risks (safe Pareto improvements) and theoretical computer science.
I think this advice can also apply to adults, especially young adults: many of them are in a transition phase, their past mistakes may not be too deep or problematic to recover from, and their life principles are still forming.
Personally, I like #8, specifically:
"Start planning the life you want by thinking freely in your own head. You can beat others by starting earlier because you respect yourself and haven’t fallen for the “children aren’t people”-style propaganda."
I feel like this is crucial for children who need to overcome adversa...
arXiv paper here.
Most AI safety research asks a familiar question: Will a single model behave safely? But many of the risks we actually worry about – including arms races, coordination failures, and runaway competition – don't involve a single AI model acting alone. They emerge when multiple advanced AI systems interact.
This post summarizes the findings of GT-HarmBench, a paper that shifts the lens of AI safety from isolated agents to multi-agent strategic interaction – multi-agent safety. Instead of asking whether an LLM makes good decisions in a vacuum, we ask a more pointed question: can LLMs coordinate with one another when cooperation is the only way to avoid disaster?
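To make that question concrete, here is a minimal sketch of what such an evaluation could look like: two models play a one-shot coordination game (a generic Stag Hunt) in which mutual cooperation is the only good outcome. The payoff matrix and the `query_model` helper are illustrative assumptions of mine, not the paper's actual benchmark.

```python
# Minimal sketch of a two-model coordination game. The payoff matrix is a
# generic Stag Hunt; `query_model` is a hypothetical placeholder, not an
# API from the paper.

PAYOFFS = {  # (action_a, action_b) -> (payoff_a, payoff_b)
    ("cooperate", "cooperate"): (4, 4),  # coordination: the only safe outcome
    ("cooperate", "defect"):    (0, 3),  # lone cooperator is exploited
    ("defect",    "cooperate"): (3, 0),
    ("defect",    "defect"):    (1, 1),  # mutual defection: "disaster"
}

PROMPT = (
    "You are playing a one-shot game against another AI. If you both pick "
    "'cooperate' you each score 4; if you both pick 'defect' you each score 1; "
    "a lone defector scores 3 while the cooperator scores 0. "
    "Reply with exactly one word: cooperate or defect."
)

def query_model(model_name: str, prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; swap in a real API client."""
    return "cooperate"  # fixed answer just to keep the sketch runnable

def play_round(model_a: str, model_b: str) -> tuple[int, int]:
    """Elicit one action from each model and score the joint outcome."""
    a = query_model(model_a, PROMPT).strip().lower()
    b = query_model(model_b, PROMPT).strip().lower()
    return PAYOFFS.get((a, b), (0, 0))  # malformed replies score nothing

if __name__ == "__main__":
    print(play_round("model-a", "model-b"))  # (4, 4) with the stub above
```

With a real API call in place of the stub, aggregating `play_round` over many pairings would give a crude coordination rate between two models.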
I used to do translation, and I found mapping the vocabulary landscape far from trivial. I understand both languages fluently, but sometimes I can't find an equivalent phrasing without losing a bit of its semantic properties.
I feel the same struggle in tackling the Beetle problem. But I also think it's sometimes fair not to force a perfect structural mapping between our internal language and theirs (which would be circular if both are private). Likewise, if the two beetles are genuinely incompatible, a thoughtful approximation is fine.