I’m curious: what does the long tail of websites look like for you? For me, it’s a small number of sites that I repeatedly visit (Twitter, YouTube, Hacker News, etc.) that take up the vast majority of my wasted time.

(Btw, I also built my own website blocker: https://chrome.google.com/webstore/detail/webblock/jeahkphmdfbddenabgndnooheiciocka)


Startup Roundup #1: Happy Demo Day

Simon Berens 7mo 44

I think the main beneficiaries of being able to sideload apps will be incumbents, not startups. Big companies like Spotify, Netflix, and Tinder will offer users discounts if they sideload because it will spare them the 30% Apple tax.


Sharing Information About Nonlinear

Simon Berens 8mo 284

I am confused about how to square your claim of requesting extra time for incontrovertible proof with Ben’s claim that he had a 3-hour call with you and sent the summary to Emerson, who then replied “good summary!”

Was Emerson’s full reply something like, “Good summary! We have incontrovertible proof disproving the claims made against us, please allow us one week to provide it”?


Twitter Twitches

Simon Berens 10mo 140

Bloomberg reported 2 weeks ago that Twitter resumed paying Google Cloud: https://www.bloomberg.com/news/articles/2023-06-21/twitter-resumes-paying-google-cloud-patching-up-relationship


Want to predict/explain/control the output of GPT-4? Then learn about the world, not about transformers.

Simon Berens 1y 21

You might want to clarify that, because in the post you explicitly say things like “if your goal is to predict the logits layer, then you should probably learn about Shakespearean dramas, Early Modern English, and the politics of the Late Roman Republic.”


Want to predict/explain/control the output of GPT-4? Then learn about the world, not about transformers.

Simon Berens 1y 125

This is probably obvious, but maybe still worth mentioning:

It’s important to take into account the ROI per unit time. In the time it would take me to grok transformers (let’s say 100 hours), I could read ~1 million tokens, which is ~0.0002% of the training set of GPT-3.

The curves aren’t clear to me, but I would bet grokking transformers would be more effective than a 0.0002% increase in training set knowledge.

This might change if you only want to predict GPT’s output in certain scenarios.
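
For anyone who wants to check the back-of-envelope numbers above, here is a rough sketch. The ~500 billion token corpus size is an assumption on my part (the comment does not state what figure it used), and the variable names are only illustrative.

```python
# Back-of-envelope check of the reading-vs-training-set estimate.
# ASSUMPTION: the GPT-3 training corpus is taken to be ~500 billion tokens;
# this figure is not stated in the comment itself.
TRAINING_SET_TOKENS = 500e9

TOKENS_READ = 1e6   # "~1 million tokens", from the comment
HOURS_SPENT = 100   # "let's say 100 hours", from the comment


def percent_of_training_set(tokens_read: float, training_tokens: float) -> float:
    """Return tokens_read as a percentage of training_tokens."""
    return tokens_read / training_tokens * 100


percent = percent_of_training_set(TOKENS_READ, TRAINING_SET_TOKENS)
tokens_per_hour = TOKENS_READ / HOURS_SPENT  # implied reading speed

print(f"{percent:.4f}% of the training set")
print(f"{tokens_per_hour:,.0f} tokens per hour")
```

Under that assumption the percentage matches the ~0.0002% figure, and the implied reading speed is about 10,000 tokens per hour. A smaller assumed corpus would raise the percentage somewhat, but not enough to change the comparison.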


Bing Chat is a Precursor to Something Legitimately Dangerous

Simon Berens 1y 30

I agree that recursive self-improvement can be very very bad; in this post I meant to show that we can get less-bad-but-still-bad behavior from only (LLM, REPL) combinations.


Buy Duplicates

Simon Berens 1y 10

Yeah, this isn't something I have an ugh field around, but having portable versions of travel stuff like shampoo, skincare, and chargers ready to go is nice.
