An RSS reader sends periodic requests to get the latest version of a feed. Each request includes a User-Agent header, identifying which fetcher is running:

Feedbin feed-id:1242010 - 38 subscribers

This fetcher is nicely passing along statistics, saying how many readers it represents.

I took one day of logs, with 5,962 requests for my RSS feed:

$ sudo grep '"GET /news.rss ' \
    /var/log/nginx/access.log.1 \
  | awk -F'"' '{print $6}' \
  | wc -l
5962

There were 162 unique User-Agents:

$ sudo grep '"GET /news.rss ' \
    /var/log/nginx/access.log.1 \
  | awk -F'"' '{print $6}' \
  | sort \
  | uniq \
  | wc -l
162

Of the 5,962 requests, 932 (16%) included subscriber statistics:

$ sudo grep '"GET /news.rss ' \
    /var/log/nginx/access.log.1 \
  | awk -F'"' '{print $6}' \
  | grep 'subscriber\|reader' \
  | wc -l
932

These requests used 21 distinct User-Agents:

$ sudo grep '"GET /news.rss ' \
    /var/log/nginx/access.log.1 \
  | awk -F'"' '{print $6}' \
  | grep 'subscriber\|reader' \
  | sort \
  | uniq \
  | wc -l
21

Some services sent multiple requests, each reporting a different number of subscribers:

Feedbin feed-id:1242010 - 38 subscribers
Feedbin feed-id:372940 - 11 subscribers
Feedbin feed-id:382 - 1 subscribers

I suspect this comes from people using old URLs that then get redirected to my current URL. For example, now it's https://www.jefftk.com/news.rss, but it used to be http://www.jefftk.com/news.rss, and even longer ago it was an sccs.swarthmore.edu address.

Summing the subscriber counts across each service's distinct User-Agents, I see:
  • Feedly: 573
  • inoreader.com: 87
  • NewsBlur: 62
  • Feedbin: 50
  • theoldreader.com: 34
  • Dreamwidth Studios: 7
  • BazQux: 5
  • Bloglovin: 2
  • Feed Wrangler: 2
  • pine.blog: 1
While this only tells us about users who are subscribed to my blog, it seems like Feedly is the biggest player here by a lot.
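
The per-service totals can be reproduced with something like the following. This is a minimal sketch, not necessarily how the numbers above were computed: `sum_subscribers` is a hypothetical helper name, it assumes every stats-bearing User-Agent contains a number followed by "subscribers" or "readers" (true of the Feedbin strings above, an assumption for the other services), and it takes the text before the first "/" or space as the service name.

```shell
# Hypothetical helper: reads User-Agent strings on stdin, one per line,
# and prints "<total> <service>" lines, largest total first.
sum_subscribers() {
  # Collapse repeated fetches so each distinct User-Agent is counted once.
  sort -u | awk '
    # Look for "<number> subscribers" or "<number> readers" anywhere in the line.
    match($0, /[0-9]+ (subscriber|reader)s?/) {
      count = substr($0, RSTART, RLENGTH) + 0   # keeps only the leading digits
      service = $0
      sub("[/ ].*", "", service)                # assumed: name ends at first "/" or space
      total[service] += count
    }
    END { for (s in total) print total[s], s }
  ' | sort -rn
}
```

Fed from the same pipeline as the earlier commands (`... | awk -F'"' '{print $6}' | sum_subscribers`), this prints one line per service. Multi-word names such as "Dreamwidth Studios" are cut at the first space, and if a service's count changes during the day the same feed appears under two distinct strings and is counted twice, so the totals are approximate.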

Different services fetched at different intervals. Taking the shortest interval for each distinct User-Agent:

  • Feedly: 7min
  • Feedbin: 15min
  • Bloglovin: 30min
  • Dreamwidth Studios: 30min
  • Feed Wrangler: 30min
  • NewsBlur: 30min
  • BazQux: 40min
  • inoreader.com: 1hr
  • theoldreader.com: 2hr
  • pine.blog: 24hr
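
One way to get the shortest interval per User-Agent is sketched below. It is an illustration under stated assumptions rather than the method used for the list above: `min_fetch_interval` is a hypothetical helper name, the log is assumed to be in nginx's "combined" format (User-Agent in the sixth quote-delimited field, as in the earlier commands), entries are assumed to be in chronological order, and everything is assumed to fall within one calendar day in one timezone, so only the time of day is compared.

```shell
# Hypothetical helper: reads access-log lines on stdin and prints
# "<seconds><TAB><User-Agent>" for the smallest gap between consecutive
# requests from each User-Agent, shortest gap first.
min_fetch_interval() {
  awk -F'"' '
    # The timestamp sits in square brackets in the first quote-delimited field.
    match($1, /\[[^]]*\]/) {
      stamp = substr($1, RSTART + 1, RLENGTH - 2)   # dd/Mon/yyyy:HH:MM:SS zone
      split(stamp, t, ":")
      now = t[2] * 3600 + t[3] * 60 + substr(t[4], 1, 2)   # seconds since midnight
      ua = $6
      if (ua in last) {
        gap = now - last[ua]
        # Negative gaps (a log that crosses midnight) are skipped, not corrected.
        if (gap >= 0 && (!(ua in best) || gap < best[ua])) best[ua] = gap
      }
      last[ua] = now
    }
    END { for (ua in best) printf "%d\t%s\n", best[ua], ua }
  ' | sort -n
}
```

For example, `sudo grep '"GET /news.rss ' /var/log/nginx/access.log.1 | min_fetch_interval` gives the gaps in seconds. A User-Agent seen only once in the log produces no line, since there is no interval to measure.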

2 comments:

I use QuiteRSS, which sets its User-Agent to "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36"

Source: https://github.com/QuiteRSS/quiterss/blob/271c55756dcf19ca163a603a54d12f165548a602/src/main/globals.cpp#L98

Masquerading as Chrome is a mildly inconsiderate choice for an RSS reader to make, especially when it doesn't include a token for its own site. User-Agent strings for visiting websites are a mess because of a history of people coding only to the dominant browser, but RSS does not have that history.

You do see things like Feedly using Feedly/1.0 (+http://www.feedly.com/fetcher.html; 452 subscribers; like FeedFetcher-Google), where they include the FeedFetcher-Google token, but there's really no reason to pretend to be a browser.

Looks like QuiteRSS has pretended to be a browser for years: https://github.com/QuiteRSS/quiterss/commit/38ad3ce6e72f90036f1db14568f33dbf346fc1b3

Opera/9.80 (Windows NT 6.1; U; YB/3.5.1; ru) Presto/2.10.229 Version/11.62