Mir

In the day I would be reminded of those men and women,
Brave, setting up signals across vast distances,
Considering a nameless way of living, of almost unimagined values.


Flowers are selective about what kind of pollinator they attract. Diurnal flowers use diverse colours to stand out in a competition against their neighbours for visual salience. But flowers with nocturnal anthesis are generally white, as they aim only to outshine the night.

Comments

Mir8mo30

This is among the top questions you ought to accumulate insights on if you're trying to do something difficult.

I would advise primarily focusing on how to learn more from yourself as opposed to learning from others, but still, here's what I think:

I. Strict confusion

Seek to find people who seem to be doing something dumb or crazy, and for whom the feeling you get when you try to understand them is not "I'm familiar with how someone could end up believing this" but instead "I've got no idea how they ended up there, but that's just absurd". If someone believes something wild, and your response is strict confusion, that's high value of information. You can only safely say they're low-epistemic-value if you have evidence for some alternative story that explains why they believe what they believe.

II. Surprisingly popular

Alternatively, find something that is surprisingly popular—because if you don't understand why someone believes something, you cannot exclude that they believe it for good reasons.

The meta-trick to extracting wisdom from society's noisy chatter is to learn to understand what drives people's beliefs in general; then, if your model fails to predict why someone believes something, you can either learn something about human behaviour, or about whatever evidence you don't have yet.

III. Sensitivity >> specificity

It's easy to relinquish old beliefs if you are ever-optimistic that you'll find better ideas than whatever you have now. If you look back at what you wrote a year ago, and think "huh, that guy really had it all figured out," you should be suspicious that you've stagnated. Strive to be embarrassed of your past world-model—it implies progress.

So trust your mind that it'll adapt to new evidence, and tune your sensitivity up as high as the capacity of your discriminator allows. False-positives are usually harmless and quick to relinquish—and if they aren't, then believing something false for as long as it takes for you to find the counter-argument is a really good way to discover general weaknesses in your epistemic filters.[1] You can't upgrade your immune-system without exposing yourself to infection every now and then. Another frame on this:

I was being silly!  If the hotel was ahead of me, I'd get there fastest if I kept going 60mph.  And if the hotel was behind me, I'd get there fastest by heading at 60 miles per hour in the other direction.  And if I wasn't going to turn around yet … my best bet given the uncertainty was to check N more miles of highway first, before I turned around.
 — The correct response to uncertainty is *not* half-speed — LessWrong

IV. Vingean deference limits

The problem is that if you select people cautiously, you miss out on hiring people significantly more competent than you. The people who are much higher competence will behave in ways you don't recognise as more competent. If you were able to tell what right things to do are, you would just do those things and be at their level. Innovation on the frontier is anti-inductive.

If good research is heavy-tailed & in a positive selection-regime, then cautiousness actively selects against features with the highest expected value.
 — helplessly wasting time on the forum

Vingean deference limit

finding ppl who are truly on the right side of this graph is hard bc it's easy to mis-see large divergence as craziness. lesson: only infer ppl's competence by the process they use, ~never by their object-level opinions. u can ~only learn from ppl who diverge from u.
 — some bird

V. Confusion implies VoI, not stupidity

look for epistemic caves wherefrom survivors return confused or "obviously misguided".
 — ravens can in fact talk btw

  1. ^

    Here assuming that investing credence in the mistaken belief increased your sensitivity to finding its counterargument. For people who are still at a level where credence begets credence, this could be bad advice.

    VI. Epistemic surface area / epistemic net / wind-wane models / some better metaphor

    Every model you have internalised as truly part of you—however true or false—increases your ability to notice when evidence supports or conflicts with it. As long as you place your flag somewhere to begin with, the winds of evidence will start pushing it in the right direction. If your wariness re believing something verifiably false prevents you from making an epistemic income, consider what you're really optimising for. Beliefs pay rent in anticipated experiences, regardless of whether they are correct in the end.

Answer by MirSep 16, 202310

i googled it just now bc i wanted to find a wikipedia article i read ~9 years ago mentioning "deconcentration of attention", and this LW post came up. odd.

anyway, i first found mention of it via a blue-link on the page for Ithkuil. they've since changed smth, but this snippet remains:

After a mention of Ithkuil in the Russian magazine Computerra, several speakers of Russian contacted Quijada and expressed enthusiasm to learn Ithkuil for its application to psychonetics—

deconcentration of attention

i wanted to look it up bc it relates to smth i tweeted abt yesterday:

unique how the pattern is only visible when you don't look at it. i wonder what other kind of stuff is like that. like, maybe a life-problem that's only visible to intuition, and if you try to zoom in to rationally understand it, you find there's no problem after all?

oh.

i notice that relaxing my attention sometimes works when eg i'm trying to recall smth at the limit of my memory (or when it's stuck on my tongue). sorta like broadening my attentional field to connect widely distributed patterns. another frame on it is that it enables anabranching trains of thought. (ht TsviBT for the word & concept)

An anabranch is a section of a river or stream that diverts from the main channel or stem of the watercourse and rejoins the main stem downstream.

here's my model for why it works:

(update: i no longer endorse this model; i think the whole framework of serial loops is bad, and think everything can be explained without it. still, there are parts of the below explanation that don't depend on it, and it was a productive mistake to make.)

  1. Working Memory is a loop of information (parts of the chewbacca-loop is tentatively my prime suspect for this). it's likely not a fully synchronised clock-cycle, but my guess is that whenever you combine two concepts in WM, their corresponding neural ensembles undergo harmonic locking to remain there.[1]
  2. every iteration, information in the loop is a weighted combination of:
    a. stuff that's already in working memory
    b. new stuff (eg memories) that reaches salience due to sufficient association with stuff from the previous iteration of WM
    c. new stuff from sensory networks (eg sights, sounds) that wasn't automatically filtered out by top-down predictions
  3. for new information (B or C) to get into the loop, it has to exceed a bottom-up threshold for salience.
  4. the salience network (pictured below) determines the weighting between the channels (A, B, C), and/or the height of their respective salience thresholds. (both are ways to achieve the same thing, and i'm unsure which frame is more better.)
  5. "concentrating hard" on trying to recall smth has the effect of silencing the flow of information from B & C, such that the remaining salience is normalised exclusively over stuff in A. iow, it narrows the flow of new information into WM.
    1. (bonus point: this is what "top-down attention" is. it's not "reach-out-and-grab" as it may intuitively feel like. instead, it's a process where the present weighted combination of items in WM determines (allocates/partitions) salience between items in WM.)
  6. this is a tradeoff, however. if you narrow all salience towards eg a specific top-down query, this has smth like the following two effects:
    1. you make it easier to detect potential answers to the query by reducing the weight of unrelated competing noise
    2. but you also heighten the salience threshold those answers must exceed to reach you

The salience network is theorized to mediate switching between the default mode network and central executive network.

in light of this, here are some tentative takeaways:

  1. if your WM already contains sufficient information to triangulate towards the item you're looking for, and the recollection/insight is bottlenecked by competing noise, concentrate harder. 
  2. but if WM doesn't have sufficient information, concentrating could prematurely block essential cues that aren't yet strongly associated with the query directly.
  3. and in cases where features in the query itself are temporarily interfering w the recollection, globally narrowing or broadening concentration may not unblock it. instead, consider pausing for a bit and try to find alternative ways to ask the question.
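
A toy sketch of the tradeoff described in the model above, under loudly made-up assumptions: the channel weights, the `concentration` knob, the threshold and every activation value are hypothetical and only illustrate the idea that concentrating shifts weight away from channels B and C. It is not a claim about how working memory is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    channel: str       # "A" = already in WM, "B" = associated memory, "C" = sensory input
    activation: float  # bottom-up strength in [0, 1] (hypothetical scale)


def next_wm(items, concentration, threshold=0.3):
    """Return the names of items that make it into the next WM iteration."""
    # Assumed weighting: concentrating moves weight away from channels B and C.
    weights = {"A": 1.0, "B": 1.0 - concentration, "C": 1.0 - concentration}
    return [it.name for it in items if weights[it.channel] * it.activation > threshold]


items = [
    Item("query", "A", 0.9),
    Item("weak cue", "B", 0.5),
    Item("street noise", "C", 0.7),
]

relaxed = next_wm(items, concentration=0.2)  # cue and noise both get through
focused = next_wm(items, concentration=0.8)  # noise is gone, but so is the weak cue
print(relaxed, focused)
```

With these numbers, relaxing lets the weak cue in together with the noise, while focusing filters both out, which is the situation the second takeaway warns about.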

Ithkuil

Natural languages are adequate, but that doesn't mean they're optimal.
 — John Quijada

i'm a fan of Quijada (eg this lecture) and his intensely modular & cognitive-linguistics-inspired conlang, Ithkuil.

that said, i don't think it sufficiently captures the essences of what enables language to be an efficient tool for thought. LW has a wealth of knowledge about that in particular, so i'm sad conlanging (and linguistics in general) hasn't received more attention here. it may not be that hard, EMH doesn't apply when ~nobody's tried.

We can think of a bunch of ideas that we like, and then check whether [our language can adequately] express each idea. We will almost always find that [it is]. To conclude from this that we have an adequate [language] in general, would [be silly].
 — The possible shared Craft of Deliberate Lexicogenesis (freely interpreted)

  1. ^

    Furthermore, a relationship with task performance was evident, indicating that an increased occurrence of harmonic locking (i.e., transient 2:1 ratios) was associated with improved arithmetic performance. These results are in line with previous evidence pointing to the importance of alpha–theta interactions in tasks requiring working memory and executive control. (Julio & Kaat, 2019)

Mir8mo233

when making new words, i try to follow this principle:

label concepts such that the label has high association w situations in which you want the concept to trigger.[1]

the usefwlness of a label can be measured on multiple fronts:

  1. how easy is it to recall (or regenerate):
    a. the label just fm thinking abt the concept?
      1. low-priority, since you already have the concept.
    b. the concept just fm seeing the label?
      1. mid-priority, since this is easy to practice.[2]
    c. the label fm situations where recalling the concept has utility?
      1. high-priority, since this is the only reason to bother making the label in the first place.

if you're optimising for b, you might label your concept "distributed boiling-frog attack" (DBFA). someone cud prob generate the whole idea fm those words alone, so it scores highly on that criterion.

it scores poorly on c, however. if i'm in a situation in which it is helpfwl for me to notice that someone or something is DBFAing me, there are few semiotic/associative paths fm what i notice now to the label itself.

if i reflect on what kinds of situations i want this thought to reappear in, i think of something like "something is consistently going wrong w a complex system and i'm not sure why but it smells like a targeted hostile force".

maybe i'd call that the "invisible hand of malice" or "inimicus ex machina".

i rly liked the post btw! thanks!

  1. ^

    i happen to call this "symptomatic nymation" in my notes, bc it's about deriving new words from the effects/symptoms of the referent concept/phenomenon. a good label shud be a solution looking for a problem.

  2. ^

    deriving concept fm label is high-priority if you want the concept to gain popularity, however. i usually jst make words for myself and use them in my notes, so i don't hv to worry abt this.

Mir8mo62

here's the non-quantified meaning in terms of wh-movement from right to left:

for conlanging, i like this set of principles:

  1. minimise total visual distance between operators and their arguments
  2. minimise total novelty/complexity/size of all items the reader is forced to store in memory while parsing
    1. every argument in memory shud find its operator asap, and vice versa
    2. some items are fairly easy to store in memory (aka active context)
      1. like the identity of the person writing this comment (me)
      2. or the topic of the post i'm currently commenting on (clever ways to weave credences into language)
    3. other items are fairly hard
      1. often the case in mathy language, bc several complex-and-specific-and-novel items are defined at the outset, and those items are not given intuitive anaphora.
      2. another way to use sentence-structure to offload memory-work is by writing hierarchical lists like this, so you can quickly switch gaze btn ii., c, and 2—allowing me to leverage the hierarchy anaphorically.

so to quantify sentence , i prefer ur suggestion "I think it'll rain tomorrow". the percentage is supposed to modify "I think" anyway, so it makes more sense to make them adjacent. it's just more work bc it's novel syntax, but that's temporary.

otoh, if we're specifying that subscripts are only used for credences anyway, there's no reason for us to invoke the redundant "I think" image. instead, write

it'll rain tomorrow

in fact, the whole circumfix operator is gratuitously verbose![1] just write:

rain tomorrow

  1. ^

    natlangs smh…[2]

  2. ^

    tbh i wish we had an editor w optimised LaTeX keyboard shortcuts so we cud effortlessly use  wherever we fancy.[3]

  3. ^

    additionally, we should just make much more use of the dimensionality afforded to us by the editors we have.

    it's free cheap  real-estate.

Mir2d51

I gave it a try two years ago, and I rly liked the logic lectures early on (basicly a narrativization of HAE101 (for beginners)), but gave up soon after.  here are some other parts I lurned valuable stuff fm:

  • when Keltham said "I do not aspire to be weak."
  • and from an excerpt he tweeted (idk context):

    "if at any point you're calculating how to pessimize a utility function, you're doing it wrong."
  • Keltham briefly talks about the danger of (what I call) "proportional rewards".  I seem to not hv noted down where in the book I read it, but it inspired this note:
    • If you're evaluated for whether you're doing your best, you have an incentive to (subconsciously or otherwise) be weaker so you can fake doing your best with less effort. Never encourage people "you did your best!". An objective output metric may be fairer all things considered.
    • and furthermore caused me to try harder to eliminate internal excusification-loops in my head.  "never make excuses for myself" is my ~3rd Law, and Keltham helped me be hyperaware of it.
      • (unrelatedly, my 1st Law is "never make decisions, only ever execute strategies" (origin).)
    • I already had extensive notes on this theme, originally inspired by "Stuck In The Middle With Bruce" (JF Rizzo), but Keltham made me revisit it and update my behaviour further.
    • re "handicap incentives", "moralization of effort", "excuses to lose", "incentive to hedge your bets"
  • I also hv this quoted in my notes, though only to use as diversity/spice for explaining stuff I already had in there (I've placed it under the idionym "tilling the epistemic soil"):
    • Keltham > "I'm - actually running into a small stumbling block about trying to explain mentally why it's better to give wrong answers than no answers? It feels too obvious to explain? I mean, I vaguely remember being told about experiments where, if you don't do that, people sort of revise history inside their own heads, and aren't aware of the processes inside themselves that would have produced the previous wrong or suboptimal answer. If you don't make people notice they're confused, they'll go back and revise history and think that the way they already thought would've handled the questions perfectly fine."

do u have recommendations for other sections u found especially insightfwl or high potential-to-improve-effectiveness?  no need to explain, but link is appreciated so I can tk look wo reading whole thing.

Mir5d10

some metabolic pathways cannot be done at the same time

Have you updated on this since you made this comment (I ask to check whether I should invest in doing a search)? If not, do you now recall any specific examples?

Mir18d30

Edit: I found the post usefwl, thankmuch!!

Mh, was gonna ask when you were taking it.  I'm preparing to try it as a sleep-aid for when I adjust my polyphasic sleep-schedule (wanting to go fm 16h-cycles potentially down to 9h) bc it seems potentially drowsymaking and has much faster plasma decay-rate[1] compared to alts.  This is good for polyphasic if not want drowsy aft wake.

The data in [1] concerns 100mg tablets, however, and a larger dose (eg 400mg) may last longer. The kinetic model[2] they use will prob be good estimate of plasma concentrations even if adjust dose.

Question is whether it's good estimation for duration of action in the brain, esp given that it's "single-compartment model" (the blood is one compartment, and the brain is another). My heuristic for whether plasma T predicts brain T is whether the molecule v easily passes the BBB (as melatonin does), since then I can guess that the curve for the brain will look similar to the curve for the blood, offset slightly down and to the right.
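
A minimal sketch of what adjusting the dose does in a one-compartment model, assuming the standard first-order absorption and elimination form (the Bateman equation) with full bioavailability. Whether this is exactly the form used in [1] is an assumption, and `KA`, `KE` and `VD_L` are hypothetical placeholder values, not the paper's fitted parameters.

```python
import math

# All parameter values below are hypothetical placeholders, not taken from the cited paper.
KA = 2.0     # absorption rate constant (1/h), assumed
KE = 0.6     # elimination rate constant (1/h), assumed
VD_L = 50.0  # apparent volume of distribution (L), assumed


def plasma_conc(t_h, dose_mg, ka=KA, ke=KE, vd_l=VD_L):
    """Plasma concentration (mg/L) t_h hours after one oral dose (Bateman equation)."""
    scale = dose_mg * ka / (vd_l * (ka - ke))
    return scale * (math.exp(-ke * t_h) - math.exp(-ka * t_h))


def time_of_peak(ka=KA, ke=KE):
    """Time of peak concentration (h); note that the dose does not appear."""
    return math.log(ka / ke) / (ka - ke)


def hours_above(threshold, dose_mg, step_h=0.01, horizon_h=24.0):
    """Approximate hours spent above a concentration threshold, on a time grid."""
    steps = int(horizon_h / step_h)
    return sum(step_h for i in range(steps) if plasma_conc(i * step_h, dose_mg) > threshold)


for dose in (100, 200, 400):
    peak = plasma_conc(time_of_peak(), dose)
    print(f"{dose} mg: peak {peak:.2f} mg/L at {time_of_peak():.2f} h, "
          f"{hours_above(0.5, dose):.1f} h above 0.5 mg/L")
```

In a linear model like this the whole curve scales with the dose, so the time of peak stays put while the time spent above any fixed concentration gets longer. The 0.5 mg/L threshold is arbitrary and says nothing about what concentration is actually drowsymaking.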
 

  1. ^

    Typical concentration-time curve of plasma ʟ-theanine of one participant after intake of 100 mg ʟ-theanine via one capsule (A) or 250 mL green tea (B). Circles represent measured concentrations of ʟ-theanine. The line represents the modeled plasma concentration-time curve by the use of the 1-compartment model.
    Kinetics of ʟ-Theanine Uptake and Metabolism in Healthy Participants Are Comparable after Ingestion of ʟ-Theanine via Capsules and Green Tea, , 4 - ScienceDirect

  2. ^

    Hot tip, you can screenshot pdf equations (i rec Text Grab or Greenshot) and ask gpt-4-turbo (updated) to write it in latex and/or jupyter-notebook-compatible python[3].

  3. ^

    Getting a tweakable python model for it was nontrivial after 35m of trying, so i'll prob j wing it w 200mg 1h pre bedtime and do 3h sleep, and adjust bon feelings.  ig the primary variable that determines how alert i feel upon waking up is getting the sleep-cycle timing right, and planning my wake-up routines (lights & text-to-speech-model as alarm) for j after I've done REM-sleep[4].

  4. ^

    While I probably would feel alert if waking up in middle of REM-sleep (after emerging from deep), I want to avoid that bc studies show bad effects from targeted deprivation of REM (leaving other phases untouched).

Mir2mo-3-10

API requests should be automatically screened for human intent, and requests judged by the model to be disrespectfwl should be denied. (And they shouldn't be trained to agree to respond to everything.)

I appreciate the post, but I also wish to hear more detailed and realistic scenarios of exactly how we might end up accidentally (or intentionally) sleepwalking into a moral catastrophe. I think it's unlikely that punishment walls will make AIs more productive, but similar things may be profitable/popular if advertised for human (sadist) entertainment.

Mir3mo10

this is rly good.  summary of what i lurned:

  • assume the ravens call a particular pattern iff it rains the next day. 
    • iow, observing the raven's call is strong evidence u ought to hv an umbrella rdy for tomorrow.
    • "raven's call" is therefore a v good predictive var.
  • but bribing the ravens to hush still might not hv any effect on whether it actually rains tomorrow.
    • it's therefore a v bad causal var.
  • it cud even be the case that, up until now, it never rained unless the ravens called, and intervening on the var cud still be fruitless if nobody's ever done that bfr.
  • for systems u hv 100% accurate & 100% complete predictive maps of, u may still hv a terrible causal map wrt what happens if u try to intervene in ways that take the state of the system out of the distribution u'v been mapping it in.
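
A small simulation sketch of the points above. The causal structure is an assumption added for illustration: a hidden `storm_front` variable (hypothetical, with a made-up 30% daily probability) drives both the rain and the ravens' call, and `bribe_ravens` stands for the intervention of hushing them.

```python
import random


def simulate(days, bribe_ravens=False, seed=0):
    """Return a list of (call, rain) pairs, one per day."""
    rng = random.Random(seed)
    records = []
    for _ in range(days):
        storm_front = rng.random() < 0.3         # hidden common cause (assumed)
        rain = storm_front                       # rain depends only on the front
        call = storm_front and not bribe_ravens  # ravens react to the front unless bribed
        records.append((call, rain))
    return records


def rain_rate(records):
    """Fraction of days on which it rained."""
    return sum(rain for _, rain in records) / len(records)


def rain_given_call(records):
    """Fraction of call-days that were followed by rain."""
    after_call = [rain for call, rain in records if call]
    return sum(after_call) / len(after_call)


observed = simulate(10_000)
intervened = simulate(10_000, bribe_ravens=True)

print("P(rain | call), observed:", rain_given_call(observed))
print("rain rate, observed:     ", rain_rate(observed))
print("rain rate, ravens bribed:", rain_rate(intervened))
```

In the observed data the call predicts rain perfectly, yet silencing the ravens leaves the rain rate unchanged, because the intervention never touches the variable that actually causes the rain.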

how then do u build good causal maps?

  • ig u can still improve ur causal maps wo trial-and-error (empirically testing interventions) if u just do predictive mapping of the system, and u focus in on the predictive power of the simplest vars, and do trial-and-error in ur simulations.  or smth.

Mir4mo00

u'r encouraged to write it!

You have permission to steal my work & clone my generating function. Liberate my vision from its original prison. Obsolescence is victory. I yearn to be surpassed. Don't credit me if it's more efficient or better aesthetics to not. Forget my name before letting it be dead weight.
