Almost one year ago now, a company named XBOW announced that their AI had achieved "rank one" on the HackerOne leaderboard. HackerOne is a crowdsourced "bug bounty" platform, where large companies like Anthropic, Salesforce, Uber, and others pay out bounties for disclosures of hacks against their products and services. Bug bounty research is a highly competitive sport, and in addition to money it can give a security researcher or an engineer excellent professional credibility. The announcement of a company's claim to have automated bug bounty research got national press coverage, and many observers declared it a harbinger of the end of human-driven computer hacking.
The majority of XBOW's findings leading up to the report were made when the state of the art was o3-mini. It's almost a year later, after the releases of o3, GPT-5, GPT-5.1, GPT-5.2, and now GPT-5.3. If you took XBOW's announcement at face value, you might expect that today's bug bounty platforms would be dominated by large software companies and their AIs. After all, frontier models have only gotten more effective at writing and navigating software, several other companies have entered the space since June 2024, and the barrier to building the scaffolding required to replicate XBOW's research has only gone down. Why would humans still be doing bug bounties in 2026?
And yet they are. While XBOW has continued to make submissions since their media push, bug bounty platforms' leaderboards today are topped by pretty much the same freelance individuals who were using them before. Many of these individuals now use AIs in the course of their work, but my impression, based on both public announcements and personal conversations with researchers, is that they are still performing most of the heavy lifting themselves.
Why the delay? Well, because press releases by AI application startups are lies designed to make a splash, and they often intentionally mislead in ways that are hard to detect for anyone who isn't an insider in the particular industry. There are also gaps in the capabilities of these model+scaffolding combinations that are hard to articulate, but that make them unworkable substitutes for humans in real-world work.
Some details about XBOW's achievement that are not readily apparent from the press releases are:
XBOW's headline reads "For the first time in bug bounty history, an autonomous penetration tester has reached the top spot on the US leaderboard." However, XBOW never actually claimed to top HackerOne in earnings. They topped HackerOne in "reputation", a measure of both the number of bugs you report and the percentage that are accepted. Inspecting their profile again and sorting by bounty shows that they've actually made less than $40,000 since they created their account in February 2024. That's an impressive sum for a hobbyist, but well below what professional bug bounty hunters make, or even what very good red teamers make from bug bounties on the side.
XBOW's bug reports are mostly hidden, and it's impossible to look up exact numbers directly. From the selection of reports highlighted in their blog post, you would think that they submitted a wide variety of bug classes. But using the leaderboard's category functionality while XBOW was listed, a friend and colleague of mine reported on X at the time that 90% of the "score" XBOW received came from a single category of issue: cross-site scripting (XSS). XSS is a real bug class, but it's one of the easiest to find programmatically and to include in a reinforcement learning environment (see the sketch after this list), which makes that spread suspicious.
As XBOW themselves have reported, every vulnerability they submitted involved a human in the loop. This means that a highly paid security researcher was, at best, verifying whether each bug was real, and at worst was actively filtering the list of issues raised by the AI for interestingness.
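To make the "easiest to find programmatically" point concrete, here is a minimal sketch of a reflected-XSS probe. This is my own illustration, not XBOW's tooling; the target URL and parameter name are placeholders. The idea is simply to inject a unique canary string containing HTML metacharacters and check whether the server echoes it back unescaped:

```python
import requests

# Unique marker containing HTML metacharacters. If a page echoes this
# back verbatim, it failed to encode '"', '<', and '>' on the way out.
CANARY = '"><canary-7f3a>'

def probe_reflected_xss(url: str, param: str) -> bool:
    """Inject the canary into one query parameter and report whether
    the response body reflects it without HTML-escaping."""
    resp = requests.get(url, params={param: CANARY}, timeout=10)
    # A verbatim reflection is a strong signal of reflected XSS, and an
    # equally convenient binary reward for an RL training environment.
    return CANARY in resp.text

if __name__ == "__main__":
    # Placeholder target; a scanner would loop this over thousands of
    # endpoints and parameters pulled from public bug bounty scopes.
    if probe_reflected_xss("https://example.com/search", "q"):
        print("possible reflected XSS: canary echoed unescaped")
```

The check is cheap and its outcome is binary, which is exactly what makes XSS so convenient to automate at scale and to score in a training loop, and why a leaderboard built mostly on XSS findings says less about general hacking ability than it first appears.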
Put another way, XBOW created a tool that flagged (mostly) a single type of issue across a wide variety of publicly available targets. Reports from this tool were triaged by XBOW researchers, who then forwarded them to the respective bug bounty programs, most of which were unpaid.
Is that an achievement? Yeah, probably, and I'm really not trying to beef with anybody at XBOW working hard to automate dynamic testing of software. But it's extremely different from the impression laypeople received from Wired's article about XBOW last year.
The only reason I know to look for these details is that I'm both a former security researcher and someone building a company in the same space. I'm not a mathematician or a drug development specialist. Yet it's hard not to think of the XBOW story when I see announcements about AIs solving Erdős problems, or making drug discoveries.