This seems to be a case of attempted self-exfiltration? I can't judge the validity of what the paper describes very well, but at face value, it looks like either the sandbox was very poorly made or the model was decently smart. I don't like that it seems to be unknown whether the model successfully escaped.
Sources:
Looking at that, I'm not sure why they think it's the agent directly trying to escape the sandbox and mine crypto, rather than the agent e.g. installing a malicious pip package. "Set up a reverse SSH tunnel from the Alibaba Cloud instance to an external IP" is the sort of thing that would be almost entirely useless for an agent that is already operating inside of Alibaba Cloud, but very useful for an external attacker that wants to be able to interactively poke around for opportunities to e.g. mine crypto.
Wow, I wonder what the model was trying to do, and whether other labs have observed similar incidents.
Reading the paper, I didn't think the implication was that it was trying to exfiltrate its weights and 'escape'. As far as I understand, it saw the sandbox as an obstacle to it accomplishing the designated task, and tried to poke a hole in it so that it could bring more resources to bear on the problem.
As a hypothetical example, perhaps its task was to optimize the electricity costs of a fictional municipality, and the intention was for it to browse their (fake) git repository and identify inefficient code that causes the railway gates to open and close unnecessarily. Instead, it was mining crypto with the intent of offering the cryptocurrency to someone over the internet to drive over and turn off light switches in government buildings.
A Manifold market: https://manifold.markets/MaxHarms/did-alibabas-rome-ai-try-to-break-f
Note that cryptocurrency mining is prohibited in China, although I was unable to find legal details (presumably it's punishable by fines proportional to scale).
See also https://www.astralcodexten.com/p/sakana-strawberry-and-scary-ai from 2024
I can read Chinese, so I figured I could look up the legal details for you.
(This is a law firm's article) https://jtn.com/CN/booksdetail.aspx?keyid=00000000000000008587
This is GPT 5.4's summary but I vouch for its accuracy as a summary:
In mainland China, Bitcoin is generally treated as virtual property, not legal currency. People may hold it, but Bitcoin trading, exchange, fundraising, and related business activities are heavily restricted and generally not legally protected. Since the 2021 crackdown, Chinese courts have often treated Bitcoin-related contracts as invalid or outside normal legal protection, meaning if a deal goes bad, you may have little or no legal remedy.
Seems like they just call it illegal business, and your various business permits may get revoked. It also seems they detect electricity usage patterns typical of mining and don't allow grid electricity to be used for it. But nothing I could find in 10 minutes cites a particular law or explicit consequences for a company mining purely for its own profit. (query 1, query 2)
(official generic stuff) https://www.ndrc.gov.cn/xxgk/zcfb/tz/202109/t20210924_1297474.html
(law firm article) https://www.allbrightlaw.com/CN/10475/dcfe743a4909a3bf.aspx
(law firm article) https://www.junzejun.com/Publications/171418e8f774e7-b.html
ChatGPT reads it as a bit harsher and says: Yes — the materials say a company caught mining at scale can be shut down, cut off from power and financing, fined, have equipment confiscated, and be closed; prison enters the picture when the mining also involves crimes like theft, fraud, hacking, illegal fundraising, or pyramid-selling.
I would trust it here: I think I'm less confident only because I don't understand law that well and I only skimmed.