This is a linkpost for https://grgv.xyz/blog/awf/

There are many apps for blocking distracting websites: freedom.to, LeechBlock, SelfControl, and Cold Turkey, just to name a few. They are useful for maintaining focus, avoiding procrastination, and curbing addictive web surfing.

They work well for blocking a short list of distracting websites. For me, this is not enough, because I spend much of my time on a long tail of websites that I check out for a minute or two and then never visit again. It’s impossible to maintain a blocklist for this long tail. Also, the web has grown so much that there are too many easily found alternatives for any blocked distraction.

Well, GPT-4 to the rescue! With an LLM, it’s possible to block websites based on their content, checking whether each page is distracting or useful/productive.

To test the idea, I implemented a prototype of a distraction-filtering browser extension. This way, GPT-4 turns into a personal productivity assistant!

The extension sends the content of each loaded page to the OpenAI API and asks GPT-4 whether the page should be blocked. The prompt can be edited in the config window; the following prompt is used by default:

You are a smart web filter, a distraction blocker with goal of improving user's productivity.
Following types of pages are distracting and should be blocked:
entertainment, shopping, online stores, social networking, news, magazines, lists of links, blogs without technical/educational content.
Following types of pages are useful for work and should not be blocked:
software development, technical information, general reference, manuals, answers to technical questions
Should a web page with following content be blocked? Answer only YES or NO, followed with a newline and a brief explanation.
===
{text}
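
As a rough illustration of the flow described above, here is a minimal TypeScript sketch of filling the `{text}` placeholder and reading the YES/NO reply. The function names, the `maxChars` truncation limit, and the choice to treat any reply that does not start with YES as "do not block" are assumptions made for this example; they are not taken from the extension's actual code, and the API call itself is left out.

```typescript
// Hypothetical helpers; the extension's real code may differ.
type Verdict = { block: boolean; explanation: string };

// Fill the {text} placeholder with a truncated sample of the page text.
// maxChars is an assumed limit that keeps the request small.
function buildPrompt(template: string, pageText: string, maxChars: number = 2000): string {
  const sample = pageText.trim().slice(0, maxChars);
  // A replacer function keeps "$" sequences in the page text literal.
  return template.replace("{text}", () => sample);
}

// Parse a reply shaped like "YES\n<explanation>" or "NO\n<explanation>".
// Anything that does not start with YES is treated as "do not block".
function parseVerdict(reply: string): Verdict {
  const [first = "", ...rest] = reply.trim().split("\n");
  return {
    block: first.trim().toUpperCase().startsWith("YES"),
    explanation: rest.join("\n").trim(),
  };
}
```

Defaulting to "do not block" on an unexpected reply is one possible choice; a stricter filter could block instead.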

Sensitive content, whitelist & blacklist

While the extension is active, it sends a sample of each visited page’s content to the OpenAI API. This might be a problem for pages with sensitive content.

You can add any domains which you do not want to expose to OpenAI to the whitelist or the blacklist. Pages that are matched are allowed or blocked without sending content to OpenAI.
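
The list check can be sketched as a small pure function that runs before any request is made. This is only an illustration: the names are hypothetical, and both the subdomain matching and the rule that the whitelist takes precedence over the blacklist are assumptions, not a description of how the extension actually behaves.

```typescript
type ListDecision = "allow" | "block" | "ask";

// True if hostname equals the listed domain or is a subdomain of it.
function matchesDomain(hostname: string, domain: string): boolean {
  const h = hostname.toLowerCase();
  const d = domain.trim().toLowerCase();
  if (d === "") return false; // blank entries match nothing
  return h === d || h.endsWith("." + d);
}

// Whitelist first (assumed precedence), then blacklist.
// "ask" means the page content would be sent to the model.
function decideByLists(hostname: string, whitelist: string[], blacklist: string[]): ListDecision {
  if (whitelist.some((d) => matchesDomain(hostname, d))) return "allow";
  if (blacklist.some((d) => matchesDomain(hostname, d))) return "block";
  return "ask";
}
```

With this shape, an empty list or a blank entry never matches, so only pages on neither list reach the "ask" branch.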

OpenAI claims to handle user data securely and not to use data submitted via the API for model training. Still, if you have any concerns about the privacy and security of the pages that you visit, or if you do not want to risk leaking your browsing history, avoid using this extension.

Installation and testing

To try it out, download the extension (github.com/coolvision/awf/releases/download/0.1/awf-0.1.zip), install it (instructions), and enter your API key in the extension’s config page.

[Image: extension page]

Then navigate to any page; it might get blocked:

[Image: blocked page]

Does it work?

I have been using it for a few days, and it does work quite well, with correct decisions in most cases.

One problem is that GPT-4 is expensive: my usage has been up to ~$1/day. It would probably cost $10-30/month, which is not too much, but is still something to improve.
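
For a back-of-the-envelope check of that figure, here is a small sketch. Every input is an assumption for illustration: the page count and token counts are hypothetical, and the per-1K-token prices are assumed GPT-4 (8K context) list prices that should be checked against current pricing.

```typescript
// All inputs are illustrative assumptions, not measurements.
function dailyCostUsd(
  pagesPerDay: number,
  promptTokens: number,         // tokens sent per page (prompt + page sample)
  completionTokens: number,     // tokens in the YES/NO reply and explanation
  promptPricePer1k: number,     // USD per 1K prompt tokens
  completionPricePer1k: number  // USD per 1K completion tokens
): number {
  const perPage =
    (promptTokens / 1000) * promptPricePer1k +
    (completionTokens / 1000) * completionPricePer1k;
  return pagesPerDay * perPage;
}

// Hypothetical day: 50 pages, ~600 prompt tokens and ~50 reply tokens each,
// at assumed prices of $0.03 and $0.06 per 1K tokens.
const exampleDailyCost: number = dailyCostUsd(50, 600, 50, 0.03, 0.06);
const exampleMonthlyCost: number = exampleDailyCost * 30;
```

Under these assumptions the daily cost comes out at about $1, which matches the upper end of the usage reported above; most of it is prompt tokens, so sending a shorter page sample is the obvious lever.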

Another issue is that the OpenAI API is quite slow: it takes several seconds (up to 5-10s) to validate each page. I haven’t decided yet whether it’s a feature or a problem. On one hand, it makes web browsing more mindful, which is good; on the other, it kills the flow/momentum when I want to quickly research something.

11 comments

It might be worth trying this with GPT-3.5, since it's a relatively easy task and GPT-3.5 is 10x cheaper (and a little bit faster?).

blf (7mo)

The usual advice to get a good YES/NO answer is to first ask for the explanation, then the answer. The way you did it, GPT-4 decides YES/NO, then tries to justify it regardless of whether it was correct.
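
A sketch of what that reordering could look like, assuming the model is told to put its verdict on the last line. The instruction wording and the function name are hypothetical and untested against the real model.

```typescript
// Hypothetical replacement for the final instruction line of the default prompt.
const EXPLAIN_FIRST_INSTRUCTION: string =
  "Should a web page with following content be blocked? " +
  "First give a brief explanation, then answer YES or NO on the last line.";

// Read the verdict from the last non-empty line instead of the first.
function parseExplainFirst(reply: string): { block: boolean; explanation: string } {
  const lines = reply
    .split("\n")
    .map((l) => l.trim())
    .filter((l) => l.length > 0);
  const verdictLine = lines.pop() ?? ""; // empty reply: no verdict, do not block
  return {
    block: verdictLine.toUpperCase().startsWith("YES"),
    explanation: lines.join(" "),
  };
}
```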

This reminds me of the comment on how effective LLMs will be for mass-scale censorship.

I found an error in the application - when removing the last item from the blacklist, every page not whitelisted is claimed to be blacklisted. Adding an item back to the blacklist fixes this. Other than that, it looks good!

thanks!

[anonymous] (7mo)

This fits very nicely as an antidote to the criticisms of AI from the following discussion: https://www.nytimes.com/2023/04/07/podcasts/ai-vibe-check-with-ezra-klein-and-kevin-tries-phone-positivity.html? 

This is certainly interesting! Though I will note that I find it pretty easy to block a bunch of websites with Freedom (and apps on desktop and portable devices!). They have a list of buttons that groups websites together. And then after a work session, I evaluate if the website was a time-waster for my work-time, if it was, I block it. I don’t find it too difficult. You just gotta be honest with yourself whether a website is wasting your time or not.

For example, I used to be able to convince myself that I need YouTube to be productive sometimes so I wouldn’t block it. Which was true sometimes, but then I’d often get sucked into a YouTube black hole of worthless videos. It’s just the kind of thing you can convince yourself, “I really need to look at this for work.” But actually no, it can wait and you are just rationalizing yourself out of actually doing hard (the most important) work.

So, now I have it blocked for most of the day and just download videos when it’s unblocked if I really need the video (I’ve been going through Andrej Karpathy’s videos every morning doing this, for example).

Honestly, I’m now on the verge of blocking LessWrong and the EA forum for most of the day too now! If I need a post, I’ll read it when I’m unblocked or read it through my article saver app.

Cool project!

I’m curious: what does the long tail of websites look like for you? For me, it’s the small number of sites that I repeatedly go to (twitter, youtube, hackernews, etc…) that take up the vast majority of my wasted time.

(Btw, I also built my own website blocker: https://chrome.google.com/webstore/detail/webblock/jeahkphmdfbddenabgndnooheiciocka)

Well, apparently after blocking the worst offenders I just wander quite randomly. According to RescueTime, here are five 1-minute visits making up 5 minutes I'm not getting back :)

store.steampowered.com 
rarehistoricalphotos.com 
gamedesign.jp 
corridordigital.com
electricsheepcomix.com

Could this work with email? Some emails are productive, and some are procrastination.