
Vale's Shortform

by Vale
27th Mar 2025
1 min read

This is a special post for quick takes by Vale. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

20 comments, sorted by top scoring
[-] Vale · 3mo · 12

Taking time away from something and then returning to it later often reveals flaws otherwise unseen. I've been thinking about how to gain the same benefit without needing to take time away.

Changing perspective is the obvious approach.

In art and design, flipping a canvas often forces a reevaluation and reveals much that the eye has grown blind to. Inverting colours, switching to greyscale, obscuring parts of the image, etc., can have a similar effect.
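
For digital work, here is a minimal sketch of those transforms in Python using the Pillow library (the filename and the choice of library are placeholders of mine, not part of the original note):

    from PIL import Image, ImageOps

    # Load a work-in-progress image (placeholder path).
    img = Image.open("canvas.png").convert("RGB")

    flipped = ImageOps.mirror(img)     # flip the canvas horizontally
    inverted = ImageOps.invert(img)    # invert the colours
    grey = ImageOps.grayscale(img)     # switch to greyscale

    # View each version and look for flaws the original had hidden in plain sight.
    flipped.show()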

When writing, speaking written words aloud often helps in identifying flaws.

Similarly, explaining why you've done something – à la rubber duck debugging – can weed out things that don't make sense.

Reply
[-] CstineSublime · 3mo* · 8

I've been thinking about how to gain the same benefit without needing to take time away.


I've been thinking about this for years too. So if you do come up with a definitive practical system for this, I would love to know.

I can't vouch for the validity of all of them, but these are some other approaches I'm aware of:

Role Playing - looking at a canvas and asking (either literally or just thinking silently in a similar manner) in quick succession "What would Yayoi Kusama do now with it?" "What would Rothko do with it?" "What would Vermeer do with it?" "What would Egon Schiele do with it?".

This can be coupled with rubber-ducking where you imagine a dialogue with the person you're role-playing.

Change the Medium - you've already mentioned speaking written words aloud, but other approaches might involve putting a Word document into PDF form and reading it, or switching from a tablet to a desktop or vice versa. Then there is, of course, the option of physically printing it. Changing the typeface and font size is also a way of evoking a similar effect.

Starting from Scratch - sometimes it is useful, rather than waiting to come back to something, to simply stop working on the first draft at an arbitrary point and begin again from the beginning during the same session. Unless you're a human xerox machine, you're unlikely to create a near-perfect copy, and those small differences may suggest new and effective paths.

The Cut-up Approach - David Bowie used this technique[1], and got the idea from William Burroughs. The original process was to take a text, like lyrics, cut individual lines into strips, and then shuffle them about. While the signal-to-noise ratio may be poor, if you're trying to use it as a means of stoking new perspectives it could be effective (especially nowadays, when we can use any number of automated methods to do this for us).
Vladimir Nabokov used a similar technique: he wrote on numbered index cards, which he could, in a pinch, shuffle to "deal himself a novel"[2].[3]
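
As a rough illustration of what such an automated method might look like (a minimal sketch in Python; the example draft text is just a placeholder, and this is line-level shuffling rather than Burroughs' literal strips of paper):

    import random

    def cut_up(text, seed=None):
        """Cut-up at line granularity: shuffle the non-empty lines of a text."""
        lines = [line.strip() for line in text.splitlines() if line.strip()]
        random.Random(seed).shuffle(lines)
        return lines

    draft = """the opening line of a draft
    a second line that follows it
    a third, unrelated thought
    and a closing image"""

    for line in cut_up(draft, seed=42):
        print(line)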

Devil's Advocate/Steelmanning your opposition - If I'm writing something, sometimes I try to first reduce my bias by writing, as dishonestly but persuasively as possible, an essay for the opposing view. I believe this was an important part of how classical rhetoric was taught.[4]

Asking yourself 'why?' - This is advice I am hypocritical to give, as I need to take it myself. It applies on both the micro and macro levels: why are you writing this function in your code (and why are you using this particular algorithm to do it?), all the way up to what is the intended purpose of this entire project?
If you're painting something, why are you painting this blue dot in the bottom right-hand corner? Why blue?
I am prone to bouts of fixation, forgetting the actual intention or utility that led me down the rabbit hole in the first place - noticing when I'm fixated, and reminding myself of the original intention, is often a good way of changing perspective. But this applies more to the feeling of "stuckness" or "bashing your head against a wall".

 

  1. ^

    I was under the impression he wrote "Life on Mars?" this way, but it appears I've confabulated reviewers' descriptions of "cut up lyrics".

  2. ^

    Writing on index cards, in pencil, had become Nabokov's preferred method of composition. He would fill each card with narrative and dialogue, shuffle the completed pack and then, in the words of his editor, "deal himself a novel".
    https://www.theguardian.com/books/2009/oct/25/nabokov-original-of-laura-mccrum

  3. ^

    I am speculating here, but I believe that the very interesting repetitions and assonance you'll find in some of Nabokov's later books (where a dog seen by Hugh Person in an early chapter is brought back years later in a final chapter, or a throwaway reference to an inmate imprisoned for strangulation foreshadows another character's fate), and other butterfly-wing-like symmetries, are probably a direct result of this method of shuffling.

  4. ^

    At the very end of Aristotle's Topics, he seems to suggest that students of Rhetoric and Philosophy should actively adopt an opposition to their definitions in an argument to mitigate any "shortcomings":
    In combating definitions it is always one of the chief elementary principles to take by oneself a happy shot at a definition of the object before one, or to adopt some correctly expressed definition. For one is bound, with the model (as it were) before one's eyes, to discern both any shortcoming in any features that the definition ought to have, and also any superfluous addition, so that one is better supplied with lines of attack. 

Reply
[-] cousin_it · 3mo · 2

I think the general solution to this problem is to learn to work very fast - faster than your perception can get tired. 5 minute sketches in art, improvising in music. And becoming comfortable with throwing away and redoing stuff instead of tweaking it.

Reply
[-] Vale · 4mo · 9

There is a tendency for the last 1% to take the longest time.

I wonder if that long last 1% will be before AGI, or ASI, or both.

Reply
[-] Kajus · 3mo · 1

I don't think many people on LW believe that the last 1% will take the longest time - I believe many would say that the takeoff is exponential.

Reply
[-] Vale · 3mo · 1

I don't necessarily believe or disbelieve in the final 1% taking the longest in this case – there are too many variables to make a confident prediction. However, it does tend to be a common occurrence.

It could very well be that the 1% before the final 1% takes the longest. Based on the past few years, progress in the AI space has been made fairly steadily, so it could also be that it continues at just this pace until that last 1% is hit, and then exponential takeoff occurs.

You could also have a takeoff event that carries from now till 99%, which is then followed by the final 1% taking a long period.

A typical exponential takeoff is, of course, very possible as well.

Reply
[-] Vale · 1mo · 3

Many people agree that 'artificial intelligence' is a poor term that is vague and has existing connotations. People use it to refer to a whole range of different technologies.

However, I struggle to come up with any better terminology. If not 'artificial intelligence', what term would be ideal for describing the capabilities of multi-modal tools like Claude, Gemini, and ChatGPT?

Reply
[-] wonder · 1mo · 6

I also agree "AI" is overloaded and has existing connotations (ranging from algorithms to applications)! I would think 'generative models' or 'generative AI' works better (and one can specify 'multimodal generative models' if one wants to be super clear), but I'm also curious to see what other people would propose.

Reply
[-] Vale · 13d · 1

I just saw the term 'Synthetic Intelligence' thrown forward, which I quite like.

https://front-end.social/@heydon/115071424831331716

Reply
[-] winstonBosan · 1mo · 1

You probably don’t like the term LLM because it doesn’t describe capability. And most models are multimodal these days, so it is not just natural language.

You also wouldn’t like the term Autoregressive/Next-token predictor. Still because it says what it does, not what it is capable of.

AI is a pretty good term. As overloaded as it is.

 

Reply
[-] Vale · 4mo · 3

Following news of Anthropic allowing Claude to decide to terminate conversations, I find myself thinking about when Microsoft did the same with the misaligned Sydney in Bing Chat.

Reply
[-] Amalthea · 4mo · 1

In the Sydney case, this was probably less Sydney ending the conversation and more the conversation being terminated in order to hide Sydney going off the rails.

Reply
[-] cubefox · 4mo · 3

It was both: in the system prompt, the model was instructed to end the conversation if it was in disagreement with the user. You could also ask it to end the conversation. It would presumably send an end-of-conversation token, which then made the text box disappear.

Reply
[-] Vale · 4mo · 3

We have artificial intelligence trained on decades' worth of stories about misaligned, maleficent artificial intelligence that attempts violent takeover and world domination.

Reply
[-] Vale · 1mo* · 1

We talk and think a lot about echo chambers with social media. People view what they're aligned with, which snowballs as algorithms feed them more content of that type, which pushes their views to the extreme.

I wonder how tailor-made AI-generated content will feed into that. My worry is that AI systems can produce content perfectly aligned with a user in all ways, creating a flawless, self-feeding ideological silo.

Reply
[-] Vale · 2mo · 1

I was thinking a little bit about the bystander effect in the context of AI safety, alignment, and regulation.

With many independent actors working on and around AI – each operating with safety intentions regarding their own project – is there worrying potential for a collective bystander effect to emerge? Each regulatory body might assume that AI companies, or other regulatory bodies, or the wider AI safety community are sufficiently addressing the overall problems and ensuring collective safety.

This could lead to a situation where no single entity feels the full weight of responsibility for the holistic safety of the global AI ecosystem, resulting in an overall landscape that is flawed, unsafe, and/or dangerous.

Reply
[-] Vale · 4mo · 1

Predicting AGI/ASI timelines is highly speculative and unviable. Ultimately, there are too many unknowns and complex variables at play. Any timeline must deal with systems and consequences multiple steps out, where tiny initial errors compound dramatically. A range can be somewhat reasonable, a more specific figure less so, and accurately predicting the consequences of the final event when it comes to pass even further improbable. It is simply impractical to come up with an accurate timeline with the knowledge we currently have.

Despite this, timelines are popular – both with the general AI hype crowd and those more informed. People don't seem to penalise incorrect timelines – as evidenced by the many predicted dates we've seen pass without event. Thus, there's little downside to proposing a timeline, even an outrageous one. If it's wrong, it's largely forgotten. If it's right, you're lauded a prophet. The nebulous definitions of "AGI" and "ASI" also offer an out. One can always argue the achieved system doesn't meet their specific definition or point to the AI Effect.

I suppose @gwern's fantastic work on The Scaling Hypothesis is evidence of how an accurate prediction can significantly boost credibility and personal notoriety. Proposing timelines gets attention. Anyone noteworthy with a timeline becomes the centre of discussion, especially if their proposal is on the extremes of the spectrum.

The incentives for making timeline predictions seem heavily weighted towards upside, regardless of the actual predictive power or accuracy. Plenty to gain; not much to lose.

Reply
[-] Seth Herd · 4mo · 2

Please also consider the consequences of timelines WRT preparing for the arrival of AGI/ASI. From this perspective, accurate predictions of the technology and its consequences are very useful. For raw timelines, erring on the side of shorter seems much more useful, in that if people think timelines are shorter they will for the most part be more prepared when it actually happens.

Erring on the side of longer timelines has the potential to be disastrous. People seem to tend toward complacency anyway. Thinking they've got a long time to prepare seems to make it much likelier that we'll be collectively unprepared, and thus quite possibly all die.

This isn't an argument to distort your timelines, just to make sure you're not overestimating, and to emphasize the real possibility of short timelines given the large uncertainty.

Reply
[-] Vale · 4mo · 1

If many independent actors are working on AI capabilities, even if each team has decent safety intentions within their own project, is there a fundamental coordination problem that makes the overall landscape unsafe? A case where the sum of the whole is flawed, unsafe, and/or dangerous and thus doesn't equal collective safety?

Reply
[-] Vale · 5mo* · 0

I think people downplay the fact that when artificial intelligence companies release new models/features, they tend to do so with minimal guardrails.

I don't think it is hyperbole to suggest this is done for the PR boost gained by spurring online discussion, though it could also just be part of the churn and rush to appear on top where sound guardrails are not considered a necessity. Either way, models tend to become less controversial and more presentable over time.

Recently OpenAI released their GPT-4o image generation with rather relaxed guardrails (it being able to generate political content and images of celebrities without consent). This came hot on the heels of Google's latest Imagen model, so there was reason to rush to market and 'one-up' Google.

Obviously much of AI risk is centred around swift progress and companies prioritising that progress over safety, but minimising safety specifically for the sake of public perception and marketing strikes me as something we are moving closer towards.

This triggers two main thoughts for me:

  • How far are companies willing to relax their guardrails to beat competitors to market?
  • Where is 'the line' between guardrails relaxed enough to spur public discussion, but not so relaxed as to significantly damage the company's image or pose wider societal risk?
Reply