I'm not at all surprised by the assertion that humans share values with animals. When you consider that selective pressures act on all systems (which is to say that every living system has to engage with the core constraints of visibility, cost, memory, and strain), it's not much of a leap to conclude that there would be shared attractor basins where values converge over evolutionary timescales.
That's a good take: treating trust as “some kind of structured uncertainty object over futures” is very close to what I was gesturing toward, since a bare scalar clearly isn’t sufficient.
On reflection, I have to admit I was using “trust” a bit loosely in the post. What’s become clear to me is that I’m not really trying to model trust in the everyday sense (intentions, warmth, etc.), but something structural: roughly, how stable someone’s behavior is under visible strain, and who tends to bear the cost when things get hard. In my head it’s closer to a relational stability/reliability profile than to trust per se; trust was just the mental shorthand I’d been using.
That’s also why I’d be a bit cautious about equating this model of trust with “how much I can constrain my uncertainty about them doing things I wouldn’t want.” Predictability and trust can come apart: I can have very low uncertainty that someone will reliably screw me over, but that doesn’t make them high-trust. That said, I think your interpretation is a fair reading of what I actually wrote, and the mismatch comes from my loose language (so thanks for this comment; it was the impetus to make a change I’d had kicking around for a while).
It seems like we need both a representation of a distribution over future behaviors/trajectories and a way to mark which regions of that space are good vs. bad for me/the system.
What matters most to me is being able to model this without pretending to know someone’s internals. The visibility/strain/cost/memory breakdown is my attempt at that: who shows up where, what pressures they’re under, who actually eats the cost, and how that pattern evolves over time.
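To make the shape of that concrete, here’s a toy sketch in Python. Every field name and value below is hypothetical, just to show what “externally observable signals only” might look like; “memory” enters only as the time-ordering of the list:

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    """One observed episode. All fields are hypothetical stand-ins
    for externally observable signals, not a proposed formalization."""
    visibility: float         # 0-1: how observable the behavior was
    strain: float             # 0-1: how much pressure the situation applied
    cost: float               # 0-1: size of the cost incurred
    bore_cost: bool           # did they absorb the cost, or push it onto others?

# "Memory" is just the time-ordered record of such episodes:
history = [
    Interaction(visibility=0.9, strain=0.2, cost=0.1, bore_cost=True),
    Interaction(visibility=0.3, strain=0.8, cost=0.6, bore_cost=False),
]
```

Nothing here requires guessing at intentions; each record is something a third party could, in principle, observe.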
All that said, I really like the intuition of “not a scalar but a distribution-like object.” What’s coming together in my head is something like a trajectory-based stability profile built from a few real-valued, measurable signals rather than a full-blown complex wavefunction. I’ve got another post in the works that goes into more detail, and once that’s formalized I’m certainly open to revisiting the modeling to see where these concepts intersect.
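To gesture at what a “stability profile rather than a scalar” could look like, here’s a purely illustrative sketch: the signal names, the strain threshold, and the choice of summaries are all made up, not the formalization the upcoming post will propose.

```python
from statistics import mean

# Each episode is a dict of hypothetical observable signals in [0, 1],
# plus a flag for who absorbed the cost. All names are placeholders.
history = [
    {"visibility": 0.9, "strain": 0.2, "cost": 0.1, "bore_cost": True},
    {"visibility": 0.3, "strain": 0.8, "cost": 0.6, "bore_cost": False},
    {"visibility": 0.7, "strain": 0.9, "cost": 0.5, "bore_cost": True},
]

def stability_profile(history, strain_threshold=0.5):
    """Summarize a history into a small profile instead of one number.

    Illustrative only: the threshold and summary statistics are
    arbitrary stand-ins for whatever the eventual formalization uses.
    """
    hard = [e for e in history if e["strain"] >= strain_threshold]
    return {
        # How often they absorb the cost when things are actually hard:
        "cost_borne_under_strain": (
            mean(float(e["bore_cost"]) for e in hard) if hard else None
        ),
        # How visible their behavior tends to be overall:
        "mean_visibility": mean(e["visibility"] for e in history),
        "n_strain_episodes": len(hard),
    }

profile = stability_profile(history)
```

The point of the sketch is just that the output is a small structured object (behavior under strain, visibility, sample size) rather than a single trust score, which is the distinction the comment above is drawing.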