Mathematician, alignment researcher, doctor. Reach out to me on Discord and tell me you found my profile on LW if you've got something interesting to say; you have my explicit permission to try to guess my Discord handle if so. You can't find my old abandoned LW account but it's from 2011 and has 280 karma.
A Lorxus Favor is worth (approximately) one labor-day's worth of above-replacement-value specialty labor, given and received in good faith, and used for a goal approximately orthogonal to one's desires, and I like LessWrong because people here will understand me if I say as much.
Apart from that, and the fact that I am under no NDAs, including NDAs whose existence I would have to keep secret or lie about, you'll have to find the rest out yourself.
Sure, but you obviously don't (and can't even in principle) turn that up all the way! The key is to make sure that that mode still exists and that you don't simply amputate and cauterize it.
[2.] maybe one could go faster by trying to more directly cleave to the core philosophical problems.
...
An underemphasized point that I should maybe elaborate more on: a main claim is that there's untapped guidance to be gotten from our partial understanding--at the philosophical level and for the philosophical level. In other words, our preliminary concepts and intuitions and propositions are, I think, already enough that there's a lot of progress to be made by having them talk to each other, so to speak.
OK but what would this even look like?
Toss away anything amenable to testing and direct empirical analysis; it's all too concrete and model-dependent.
Toss away mathsy proofsy approaches; they're all too formalized and over-rigid and can only prove things from starting assumptions we haven't got yet and maybe won't think of in time.
Toss away basically all settled philosophy, too; if there were answers to be had there rather than a few passages which ask correct questions, the Vienna Circle would have solved alignment for us.
What's left? And what causes it to hang together? And what causes it not to vanish up its own ungrounded self-reference?
Clearly academia has some blind spots, but how big? Do I just have a knack for finding ideas that academia hates, or are the blind spots actually enormous?
From someone who left a corner of it: the blind spots could be arbitrarily large as far as I know, because there seemed to me to be no real explicit culture of Hamming questions/metalooking for anything neglected. You worked on something vaguely similar/related to your advisor's work, because otherwise you couldn't get connections to people who know how to attack the problem.
As my reacts hopefully implied, this is exactly the kind of clarification I needed - thanks!
Like, bro, I'm saying it can't think. That's the tweet. What thinking is, isn't clear, but That thinking is should be presumed, pending a forceful philosophical conceptual replacement!
Sure, but you're not preaching to the choir at that point. So surely the next step in that particular dance is to stick a knife in the crack and twist?
That is -
"OK, buddy:
Here's property P (and if you're good, Q and R and...) that [would have to]/[is/are obviously natural and desirable to]/[is/are pretty clearly a critical part if you want to] characterize 'thought' or 'reasoning' as distinct from whatever it is LLMs do when they read their own notes as part of a new prompt and keep chewing them up and spitting the result back as part of the new prompt for itself to read.
Here's thing T (and if you're good, U and V and...) that an LLM cannot actually do, even in principle, which would be trivially easy for (say) an uploaded (and sane, functional, reasonably intelligent) human H, even if H is denied (almost?) all of their previously consolidated memories and is just working from some basic procedural memory and whatever Magical thing this 'thinking'/'reasoning' thing is."
And if neither you nor anyone else can do either of those things... maybe it's time to give up and say that this 'thinking'/'reasoning' thing is just philosophically confused? I don't think that that's where we're headed, but I find it important to explicitly acknowledge the possibility; I don't deal in more than one epiphenomenon at a time and I'm partial to Platonism already. So if this 'reasoning' thing isn't meaningfully distinguishable in some observable way from what LLMs do, why shouldn't I simply give in?
> https://www.lesswrong.com/posts/r7nBaKy5Ry3JWhnJT/announcing-iliad-theoretical-ai-alignment-conference#whqf4oJoYbz5szxWc
you didn't invite me so you don't get to have all the nice things, but I did leave several good artifacts and books I recommend lying around. I invite you to make good use of them!
(Minor quibble: I’d be careful about using “should” here, as in “the heart should pump blood”, because “should” is often used in a moral sense. For instance, the COVID-19 spike protein presumably has some function involving sneaking into cells, it “should” do that in the teleological sense, but in the moral sense COVID-19 “should” just die out. I think that ambiguity makes a sentence like “but it might be another thing to say, that the heart should pump blood” sound deeper/more substantive than it is, in this context.)
This puts me in mind of what I've been calling "the engineer's 'should'" vs "the strategist's 'should'" vs "the preacher's 'should'". Teleological/mechanistic, systems-predictive, is-ought. Really, these ought to all be different words, but I don't really have a good way to cleanly/concisely express the difference between the first two.
To paraphrase:
Want and have. See and take. Run and chase. Thirst and slake. And if you're thwarted in pursuit of your desire… so what? That's just the way of things, not always getting what you hunger for. The desire itself is still yours, still pure, still real, so long as you don't deny it or seek to snuff it out.
@habryka Forgot to comment on the changes you implemented for soundscape at LH during the mixer - you may want to put a speaker in the Bayes window overlooking the courtyard firepit. People started congregating/pooling there (and notably not at the other firepit next to it!) because it was the locally-quietest location, and then the usual failure modes of an attempted 12-person conversation ensued.
any finite-entropy function
Uh...
(h/t to @WhatsTrueKittycat for spotlighting this for me!)
I'm gonna leave my thoughts on the ramifications for academia, where a major career step is to repeatedly join and leave different large bureaucratic organizations for a decade, as an exercise to the reader.
I have numerous thoughts on how Lorxusverse Polity handles this problem but none of it is well enough worked out to share. In sum though: Probably cybernetics (in the Beer sense) got discovered way earlier and actually got used as stated, and that was that, no particular need for dominance-status as glue or desire for it as social-good. (We'd be way less social overall, though, too, and less likely to make complex enduring social arrangements. There would be careful Polity-wide projects for improving social-contact and social-nutrition. They would be costly and weird. Whether that's good or bad on net, I can't say.)