In response to / inspired by this SSC post:
I was originally going to comment something about "how do I balance this with the need to filter for niche nerds who are like me?", but then I remembered that the post is actually literally about dunks/insults on Twitter. o_0
This, in meta- and object-level ways, got to a core problem I have: I want to do smart and nice things with smart and nice people, yet these (especially the social stuff) require me to be so careful + actually have something like a self-filter. And even trying to practice/exercise that basic s...
Yep, agreed. I'm just glad that (allegedly?) the LTFF is still doing specifically the upskilling-grant thing.
Worst case, I'd have to work on harebrained side-business ideas as well, while jobless-and-not-yet-funded (or even while-funded-but-not-for-a-long-runway, possibly?).
Forgot to mention this in the post proper, but: Pages would be organized in a multi-examples-per-subsection way, where each subsection corresponds to something like a part of an extended "ADEPT Method".
Agreed, most "fraudulent" listed public companies (on places like the NYSE, where they actually check stuff) fit weird conditions like:
(Disclaimer: not an expert, not financial advice.)
(sources: discord chats on public servers) Why do I believe X?
What information do I already have, that could be relevant here?
What would have to be true such that X would be a good idea?
If I woke up tomorrow and found a textbook explaining how this problem was solved, what's paragraph 1?
What is the process by which X was selected?
I was asking in multiple Discord servers for a certain type of mental "heuristic", or "mental motion", or "badass thing that Planecrash!Keltham or HPMOR!Harry would think to themselves, in order to guide their own thoughts into fruitful and creative and smart directions". Someone commented that this could also be reframed as "language-model prompts to your own mind" or "language-model simulations of other people in your own mind".
I've decided to clarify what I meant, and why even smart people could benefit from seemingly hokey tricks like this.
Heuristics/LMs-of-oth...
Yep, we can easily have multiple hypotheses of the form "something we don't (yet) understand has caused this". My odds are more on "weather/camera/light/experimental aircraft we don't understand" than "aliens we don't understand".
One problem with "using a simpler example" is that there's a lower bound. Prime numbers are not-too-hard to explain, at some levels of thoroughness.
Like, some part of my subconscious basically thinks (despite evidence to the contrary): "There is Easy Math and Hard Math. All intuitive explanations have been done only about Easy Math. Hard Math is literally impossible to explain if you don't already understand it."
Part of the point of Mathopedia, is to explicitly go after hard, advanced, graduate-level and research-level mathematics. To make them intelligib...
Pretty tangential, feel free to remove:
The YouTuber "BarelySociable" did a two-part video a while back, trying to figure out who Satoshi Nakamoto was. He gave pretty decent evidence that it was a British guy who's not any of the 3 candidates mentioned.
Yep! My main hope is that it works in a niche of people who needed specifically-it (or who find it more "intrinsically fun" to read and/or contribute to than the other options).
choose something that is difficult for others but simple for you.
Yep, a broader life lesson I'm still learning haha.
IIRC Paul Graham recommended such a tactic, framing it as "easier gains from moving around in problem-space than solution-space".
And your other recommendations definitely make sense here. In my giant bookmarks folder about the "mathopedia" idea, this post and the comments are bookmarked.
This is "merely" one safety idea... but it's a slam-dunk one, with (as far as I can tell) no good reason not to do it.
Very good points, yeah!
I actually attempted making an example-page in a Wikipedia sandbox, but did not have the energy/deeper-requisite-knowledge for the topic I chose (Gödel's Incompleteness Theorems ;-;), so I didn't finish it. But I do agree that, if I launched this, I'd need at least one good example-page.
Another part of the problem, which Arbital especially failed at, was getting others to contribute. Reddit and StackOverflow solve this by basically giving people literal "status points" for writing helpful effort-signaling posts. So I'd want some kind ...
These ideas seem promising!
How do you distinguish the feeling of epiphany from grokking?
Good point, I haven't really done that here. We could differentiate by e.g. having practice problems, and people can log in to track their progress. Similar to the multi-explanations/teaching-methods setup, there could be a broad variety of example problems, making it less likely someone gets lots of them right without actually understanding the concept.
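A minimal sketch of that varied-problems idea (all names and the problem bank are hypothetical, just to illustrate the mechanism): group each concept's problems by "angle", then build a quiz by sampling one problem per angle, so passing requires recognizing the concept in several distinct presentations rather than one memorized pattern.

```python
import random

# Hypothetical problem bank: each concept's problems are grouped by "angle",
# i.e. a distinct presentation of the same underlying idea.
PROBLEM_BANK = {
    "modular-arithmetic": {
        "computation": ["What is 17 mod 5?", "What is 100 mod 7?"],
        "word-problem": ["A 90-minute film starts at 11:00 PM; when does it end?"],
        "proof-sketch": ["Show: if a ≡ b (mod n), then a + c ≡ b + c (mod n)."],
    },
}

def build_quiz(concept: str, rng: random.Random) -> list[str]:
    """Sample one problem from each angle, so a correct quiz run
    demonstrates the concept in multiple forms."""
    angles = PROBLEM_BANK[concept]
    return [rng.choice(problems) for problems in angles.values()]

quiz = build_quiz("modular-arithmetic", random.Random(0))
assert len(quiz) == 3  # one problem per angle
```

Per-user progress tracking would then just be a record of which (concept, angle) pairs a logged-in user has answered correctly.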
For this incentives-reason, I wish hardcore-technical-AI-alignment had a greater support-infrastructure for independent researchers and students. Otherwise, we're often gonna be torn between "learning/working for something to get a job" and "learning AI alignment background knowledge with our spare time/energy".
Technical AI alignment is one of the few important fields that you can't quite major in, and whose closest-related jobs/majors make the problem worse.
As much as agency is nice, plenty of (useful!) academics out there don't have the kind of agency/ri...
Yeah, I keep finding myself wishing that every other message/communication platform I use, would add Discord-style custom emotes for hyperspecific situations.
I acknowledge this is phrased kinda weirdly. I would say this fits the spirit of the question (albeit as a noncentral example), plus "opting into reacts as a whole" is required on a per-post basis.
Good point. My basic idea is something like "most interp work makes it more efficient to train/use increasingly-powerful/dangerous models". So I think the two uses of "dangerous" you quote here, both fit with this idea.
Good point, yeah. That very unclarity, itself, contributed to me wasting so much time on that route.
Ah, thanks! LTFF was definitely on my list of things to apply for, I just wasn't sure if that upskilling/trial period was still "a thing" these days. Very glad that it is!
Counterpoint: at least one kind of research, mechanistic interpretability, could very well be both dangerous by helping capabilities and also essential for alignment. My current intuition is that the same could be said of other research avenues.
Yes, there are plenty of dangerous ideas that aren't so coupled with alignment, but they're not the frustrating edge-case I'm writing about. (And, of course, I'm not doing or publishing that type of research.)
Counting some AI safety blessings:
This is kinda like the "liquid democracy" software, but with delegation being automatic instead of, say, thoughtful. I like the idea of seeing only some people's upvotes, but I want to consciously and explicitly choose those people. Not by subscribing to their posts, and definitely not by upvoting something they've written.
I could imagine a toggle-switch next to the upvote counter, labeled "upvotes you trust, only" or something, along with a little link to go whitelist users into your trust-graph.
A related-but-not-the-same idea, would be to add a lot of di...
I think I agree with this regarding inside-group communication, and have now edited the post to add something kind-of-to-this-effect at the top.
This additional advice is helpful too, and yeah, my advice is probably reverse-worthy sometimes (though not, like, half the time).
A point, as far as I can guess, is something like: the persistent misunderstanding of you by others, plus the lack of time/energy/mental-stamina to explicitly correct every person who misunderstands you, equals very-hard-to-escape psychological suffering, even if it's low-grade most of the time.
Like, you can update on this ("I'm an outlier, I'm not like other people"), and it can still hurt. Angst from that, seems difficult to just make "stop happening" from one update.
I have given you an adequate explanation. If you were the kind of person who was good at math, my explanation would have been sufficient, and you would now understand. You still do not understand. Therefore...?
I felt this; I still wonder if not-prioritizing clarity (or even intentionally-being-unclear) is a useful filter for maths/logic ability, outside the costs felt by others.
I think work on the study of abstraction, one way or another, will be essential to AI alignment. Even "just" being able to make very precise high-level predictions of an AI's behavior from its internal state, or of human values from measured neurological data, requires enough abstraction-understanding to know whether the simplification is really capturing what we want.
I don't know if the natural abstractions hypothesis is really necessary for this. But something like a more developed/complete version of Wentworth's "minimal maps" representation of abstract...
This is interesting; I'm still looking for my own (I think?) "comparative advantage" in this area. Some mental motions are very easy, while some "trivial" tasks feel harder (or would require me to already be involved full-time, leading to a chicken-and-egg problem).
I agree the highlighted sentence in my article definitely breaks most rules about emphasis fonts (though not underlining!). My excuse is: that one sentence contains the core kernel of my point. The other emphasis marks (when not used for before-main-content notes) are to guide reading the sentences out-loud-in-your-head, and only use italics.
Thank you! Out of curiosity, which parts of this post made it harder to wrap your head around? (If you're not sure, that's also fine.)
But when people outside of the community complain
Except all of that was done, and some people understood; unfortunately, the people complaining now didn't.
Semirelated: is this referring to me specifically? If not, who else?
My point wasn't "the community never does the above things", it was "each member of the community should do these things, and more often".
EDIT: You are correct that I should've mentioned more of the prior-art on this, though, especially Human's Guide To Words.
Also, maybe LW should bring back "Sequences reruns" for that inferential distance post.
Agreed; I read at least part of The Checklist Manifesto, and checklists are super helpful and underrated. Even for processes which seem most amenable to a checklist (clear steps that don't change much), a checklist is still often missing.
Brain implants and/or genetic modification via viral vector for improvement of brain function (e.g. treating intractable depression).
Wait this sounds really cool.
>Can I get this personally?
I wish! Very limited clinical trials only at this point. And that's after like 20 years of needless delay after we had the basic tech and knowledge to do it. Medical research moves so frustratingly slow.
Here's a link to a news post about a recent advance: https://www.ucsf.edu/news/2021/09/421541/treating-severe-depression-demand-brain-stimulation
Thanks!
The "It's on me/you" part kinda reminds me of a quote I think of, from a SlateStarCodex post:
This obviously doesn’t absolve the Nazis of any blame, but it sure doesn’t make the rest of the world look very good either.
I was trying to make sense of it, and came up with an interpretation like "The situation doesn't remove blame from Nazis. It simply creates more blame for other people, in addition to the blame for the Nazis."
Likewise, my post doesn't try to absolve the reader of burden-of-understanding. It just creates (well, points-out) more burde...
It was very much "yeah, so, I don't know what I'm trying to say, help?", not a coherent argument. Yeah! I think the community should look into either using "epistemic status" labels to make this sort of thing clearer, or a new type of label (like "trying to be clear" vs "random idea").
It's to differentiate from more-obviously-tractable, less formalism/conceptual/deconfusion-based research agendas like, e.g., HCH. As asked, I'm looking for info specifically related to this other kind.