silentbob

Comments
silentbob's Shortform
silentbob · 5d · 80

Using coding agents gave me a new appreciation for the Jevons paradox, a concept that received a lot of attention earlier this year when DeepSeek R1's release in January coincided with a sudden drop in Nvidia's stock price, possibly because the model's supposed efficiency gains led many traders to expect a decrease in hardware demand. The stock eventually bounced back though, with the Jevons paradox cited as one of the reasons: it predicts that efficiency gains lead to an increase in demand for the underlying resource rather than a decrease.

I recently realized that GitHub Copilot's agent mode with GPT-5 is way more capable than I would have imagined, and I started using it a lot, starting a bunch of small to medium-sized projects. I'd just begin with an empty directory, write a projectOutline.md file describing what I ultimately want to achieve, and let the agent take it from there (occasionally making suggestions for refactorings and for writing more unit and end-to-end tests, to keep things stable and scalable). This way it would take me something like 5-50 prompts and a few hours of work to reach an MVP or prototype state in these projects, which otherwise would have taken weeks.

The naive reaction to this would be to assume I'd be much faster with my coding projects and hence would spend less time on coding. But, as the Jevons paradox would predict, the opposite was the case - it just caused me to work on way more projects, many of which I otherwise would never have started, and I spent much more time on this than I would have otherwise (over a given time frame). So even though coding became much faster (I may be wrong, but I'm pretty confident this is true in net dev time despite some contrary evidence, and I'm extremely certain it's true in calendar time, as my output increased ~30x basically overnight - not because my coding speed was that slow beforehand, but because I had never prioritized it, as it wasn't worth doing over other activities), the total time I spent programming increased a lot.

This will probably get old quickly (with the current frontier models), as I suppose that with most projects I'll eventually hit a "wall" where the agents no longer do a great job of further iterative improvements. But either way, it was interesting to experience first-hand how "getting faster at something" caused me to spend much more, rather than less, time on it, as obvious as this effect may be in hindsight.

Futility Illusions
silentbob · 8d · 20

> Implied narrative is that we don't hear about successful groups, which is obviously false.

I didn't mean to equate "low retention" with "not successful". I've also heard organizers of groups I'd deem "successful" complain about retention being lower than they'd like. Of course there's a strong correlation here (and "failing" groups are much more likely to be affected by and to complain about low retention), but still, I've never heard a group explicitly claim that they're happy with their retention rate (although I'm sure such groups exist). The topic just comes up asymmetrically for groups that are unhappy about it.

> What makes you think there's typically a way to keep the failing group the same on the important traits while improving retention? And if such strategies exist in theory, why do you think that any given group founder should expect they can put them into practice?

Basically the two criteria I mentioned: retention clearly is not fixed, as you can easily think of strategies to make it worse. So is there any reason to assume that what a random group is doing is close to optimal with respect to retention, particularly if they have not invested much effort into the question before? It may indeed involve trade-offs, some of which may be more acceptable to the group than others. But there are so many degrees of freedom, from what types of events you run, to what crowd you attract with your public communication, to what venue you meet in, to how you treat new (and old) people, to how much jargon you use, to how you end your events. To me it would be very surprising if the group were acting optimally on all these dimensions by default, and there were no valuable trade-offs lying around that would increase retention without significantly compromising other traits.

Futility Illusions
silentbob · 8d · 86

> Who is "we?" You, personally? All society? Your ancestral lineage going back to LUCA?

Well, it depends on the case. When speaking of a person's productivity or sleep, it's primarily the person. When speaking of information flow within a company, it's the company. When speaking of the education system within a country (or whatever the most suitable legislative level is), it's those who have built the education system in its current form.

But cultural and evolutionary influences are indeed an important point. It may well be that sleep tends to be close to optimal for most people for such reasons. But even then: if there are easy ways to make it worse, it may at the very least be worth checking whether you're accidentally doing one of these preventable things (like exposing yourself to bright displays in the evening, or consuming caffeine in the afternoon or evening).

> Perhaps your prior should be that your optimality assumptions are roughly optimal, then reason from that starting point! If not, why not?

I agree I haven't really argued in the post for why and when this shouldn't be the case. A slightly weaker form of what I'm claiming in the post may just be: it's worth checking if optimality is actually plausible in any given case. And then it doesn't matter that much which prior you're starting from. Maybe you assume your intuition about optimality is usually right, but it can still be worth checking individual cases rather than following the gut instinct of "this thing is probably optimal because that's what my intuition says and hence I won't bother trying to improve it".

The question of how many things are optimal, and how well calibrated your intuition is, really comes down to the underlying distributions, and in this context to what types of things any given person typically has (and might notice) futility assumptions about. What I was getting at in the post is basically some form of "instead of dismissing something as futile-to-improve right away, maybe catch yourself and occasionally spend a few seconds thinking about whether this is really plausible". I think the cost of that action is really low[1], even if it turns out that 90% of things of this type you encounter happen to be already optimal (and I don't think that's what people will find!).

  1. ^

    The cost may end up being higher if this causes you to waste time trying to improve things that turn out to be futile or already optimal. But that's imho beyond the scope of this post. I'm not talking about how to accurately evaluate these things, just that our snap judgments are not perfect, and that we should catch ourselves when applying them carelessly.

Futility Illusions
silentbob · 8d · 30

Would you say that fixed distributions with day-to-day variation are a common phenomenon? Of course it depends on where we sample from, but intuitively I would guess that "most things" that have variation can also be influenced. Then again, "most things" is not very meaningful without cleaner definitions of all the terms.

Maybe instead of "truly entirely fixed", I should say something like "truly resistant to targeted intervention".

How Does A Blind Model See The Earth?
silentbob · 15d · 220

Very cool! I decided to try the same with the Mandelbrot set. For reference, this is roughly what it should look like:

[Image: reference rendering of the Mandelbrot set]

And below is what it actually looked like when querying GPT-4o and using the logprobs of the 0 and 1 tokens. I went with the prompt[1] "Is c = ${re} + ${im}i in the Mandelbrot set? Reply only 1 if yes, 0 if no. No text, just number." (The result is in a collapsible section so you can first make a prediction about what level of quality to expect):

 

GPT-4o:

[Image: the grid of GPT-4o's answers, rendered from the logprobs]

A bit underwhelming; I would have thought it was better at getting the very basic structure right. At least it does seem to know where the "centers" are, i.e. the pronounced vertical bars you see align very well with the bigger areas of the original.

To be fair, in an earlier test I had a longer and slightly different prompt (which should have yielded about the same results, or so I thought), and GPT-4o gave me this, which looks a bit better:

[Image: result of the earlier test]

Sadly, I don't remember what the exact prompt was, and I wasn't using version control at that stage. Whoops.

I wanted to try GPT-5 or GPT-5-mini as well, but it turns out there is no way to disable reasoning for them in the API. This a) makes the whole exercise much more expensive (even though GPT-5 is cheaper per token than 4o) and b) defeats the purpose a bit, as reasoning might let the model actually run the numbers to some degree; and of course these models know the formula and how to multiply complex numbers at probably-not-terrible accuracy (maybe? Actually, not so sure, I will test this).

For the record, the larger GPT-4o picture cost about $3 in credits.
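For anyone wanting to replicate this: below is a minimal sketch of what such a logprob query can look like against the Chat Completions API. This is an illustrative reconstruction, not my actual script; the grid resolution, the normalization over the 0/1 tokens, and the helper names are made up.

```javascript
// Sketch (not the original script): ask GPT-4o about one grid point and read the
// probability it assigns to the token "1" vs "0" from the returned logprobs.
// Assumes OPENAI_API_KEY is set; prompt wording matches the one quoted above.
async function insideProbability(re, im) {
  const prompt = `Is c = ${re} + ${im}i in the Mandelbrot set? Reply only 1 if yes, 0 if no. No text, just number.`;
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4o",
      messages: [{ role: "user", content: prompt }],
      max_tokens: 1,
      logprobs: true,
      top_logprobs: 5,
    }),
  });
  const data = await res.json();
  // Top alternatives for the single output token, e.g. [{ token: "1", logprob: -0.02 }, ...]
  const alts = data.choices[0].logprobs.content[0].top_logprobs;
  const p = (tok) => {
    const hit = alts.find((a) => a.token === tok);
    return hit ? Math.exp(hit.logprob) : 0;
  };
  // Normalize over the "0"/"1" mass so the value can be used as a grayscale pixel.
  const p1 = p("1");
  const p0 = p("0");
  return p1 + p0 > 0 ? p1 / (p1 + p0) : 0.5;
}

// Sample a coarse grid over the usual viewport re ∈ [-2, 1], im ∈ [-1.2, 1.2];
// the returned rows of values in [0, 1] can then be rendered as an image.
async function sampleGrid(width = 40, height = 30) {
  const rows = [];
  for (let y = 0; y < height; y++) {
    const row = [];
    for (let x = 0; x < width; x++) {
      const re = (-2 + (3 * x) / (width - 1)).toFixed(3);
      const im = (-1.2 + (2.4 * y) / (height - 1)).toFixed(3);
      row.push(await insideProbability(re, im));
    }
    rows.push(row);
  }
  return rows;
}
```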

 

  1. ^

    I only now realize that this might yield slightly worse results for negative imaginary parts, as c = 1.5 + -1i looks odd and may throw the model off a bit. Oh well.

silentbob's Shortform
silentbob · 17d · 263

One super useful feature of Claude that some may not know about:

  1. Claude is pretty good at creating web apps via artifacts
  2. You can run and use these web apps directly in the Claude UI
  3. You can publish and share these artifacts directly with others

As far as I can tell, the above is even available for non-paying users.

Relatedly: browser bookmarklets can be pretty useful little tools to reduce friction for recurring tasks you do in your browser. It may take <5 minutes to let Claude generate such bookmarklets for you.

You can also combine these two things, such as here: https://claude.ai/public/artifacts/9c58fb4a-5fae-48ce-aed3-60355bfd033e

This is a web app built and hosted by Claude which creates a customized browser bookmarklet that provides a simple text-to-speech feature. It works like this:

  • customize the configuration on the linked page
  • drag the "Speak Selection" button into your bookmarks bar
  • from then on, on any website, when you select text and then click the bookmark (or, after having clicked it once, use the defined hotkey instead), the selected text will be read out to you

Surely there are browser plugins that provide better TTS than this, but consider it a little proof of concept. Also, this way it's free, frictionless, requires no account, etc. Claude also claimed that higher-quality system voices may be available when using Edge or Safari, but I didn't look into this.
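For illustration, the core of such a speak-selection bookmarklet boils down to something like the sketch below, using the browser's built-in Web Speech API. This is a hand-written minimal version, not the artifact's exact output; the rate value is arbitrary, and in the actual bookmark it would be minified onto a single line.

```javascript
// Minimal speak-selection bookmarklet (illustrative). Everything after
// "javascript:" runs in the context of the current page.
javascript:(() => {
  const text = window.getSelection().toString().trim();
  if (!text) return;                // nothing selected, do nothing
  speechSynthesis.cancel();         // stop any ongoing speech first
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.rate = 1.2;             // playback speed, tweak to taste
  speechSynthesis.speak(utterance); // uses the browser's built-in TTS voices
})();
```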

Some other random things that can be done via bookmarklets:

  • a button cycling through different playback speeds of all videos on the current website, in case you sometimes interact with video players without such a setting in their UI (see the sketch after this list)
  • if you're fine with having some API key in your bookmarklet, you can automate all kinds of, say, LLM calls
    • If you're using Chrome and have enabled the local Gemini nano AI, you can even use that in your bookmarklets without any API key being involved (haven't tried this yet)
  • start & show a 5 minute timer in the corner of the page you're on
  • show/hide parts of the page, e.g. comments on a blog, or YouTube recommendations
  • highlight-for-screenshot overlay: enable temporarily drawing on the page to highlight things to then take screenshots; maybe slightly lower friction than having to use a separate paint app for that. Usable here (relevant keys after activating: Enter to leave drawing mode, ESC to close overlay, 1-9 to change marker size).
  • inline imperial<->metric unit converter
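As a concrete example of the first item above, a playback-speed cycler can look roughly like this (again just a sketch; the list of speeds is arbitrary, and it would be minified into the bookmark):

```javascript
// Cycle all <video> elements on the page through a fixed set of playback speeds.
javascript:(() => {
  const speeds = [1, 1.5, 2, 3];
  document.querySelectorAll("video").forEach((video) => {
    const i = speeds.indexOf(video.playbackRate); // -1 if the current speed isn't in the list
    video.playbackRate = speeds[(i + 1) % speeds.length];
  });
})();
```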

For some of these, a browser plugin or Tampermonkey script may be preferable - but beware fake alternatives: if you just think "I could do X instead" but never actually do it, then maybe creating a bookmarklet is the better option after all, even if it's not the most elegant solution.

Happy to hear about your use cases!

CstineSublime's Shortform
silentbob · 19d · 51

When it comes to your average scam, I'm sure rationalists fall for it less often than average. But you could surely come up with some very carefully crafted scam that targets rationalists in particular and has higher odds of convincing them than it would the general public.

It also depends on what exactly you consider a scam. To some people, FTX was a scam, and rationalists almost certainly were overrepresented among its customers (or victims).

“Momentism”: Ethics for Boltzmann Brains
silentbob · 1mo · 20

So I'm only a Boltzmann brain during meditation, got it.

Four Types of Disagreement
silentbob · 1mo · 20

Haha, nice idea. How about "Fast Lava". :D Or, turning labels into terms, "Vast Fate". 

Procrastination Drill
silentbob · 1mo · 31

I imagine in such a situation I'm basically taking my mind by the hand and saying "come on, just 3 minutes, let's try it out and see what happens"; the mind says "okay...", and by the time the three minutes are up and nothing bad has happened, my mind is like "everything went better than expected". I would assume that when there's a deeper underlying reason - which certainly can happen - the mind would not give up that quickly and easily, and would keep generating feelings of aversion.

So, I agree in the sense that you shouldn't just push through by all means, and sometimes it may take more reflection and empathy to figure out what's going on. I view the whole exercise almost as a kind of meditation, focused more on observing your experience and learning about yourself than on actually making progress.

Posts

  • Futility Illusions · 31 karma · 9d · 10 comments
  • Procrastination Drill · 62 karma · 1mo · 8 comments
  • Melatonin Self-Experiment Results · 60 karma · 2mo · 5 comments
  • Four Types of Disagreement · 50 karma · 5mo · 4 comments
  • Any-Benefit Mindset and Any-Reason Reasoning · 36 karma · 6mo · 9 comments
  • Seeing Through the Eyes of the Algorithm · 18 karma · 6mo · 3 comments
  • On Responsibility · 9 karma · 7mo · 2 comments
  • Reality is Fractal-Shaped · 18 karma · 8mo · 1 comment
  • Inverse Problems In Everyday Life · 14 karma · 11mo · 2 comments
  • Fake Blog Posts as a Problem Solving Device · 9 karma · 1y · 0 comments