Shayne O'Neill - LessWrong

Navigating LLM embedding spaces using archetype-based directions

While the use of tarot archetypes is... questionable... it does point at an angle on exploring embedding space, which is that it is a fundamentally semiotic space. It's going in many respects to be structured by the texts that fed it, and human text is richly symbolic.

That said, there's a preexisting set of ideas around this that might be more productive, and that is structuralism, particularly the works of Lévi-Strauss, Roland Barthes, Lacan, and more distantly Foucault and Derrida.

Lévi-Strauss's anthropology in particular is interesting, because it looked at the mythologies of humans and tried to find structuring principles underlying them, particularly the "dialectics", or oppositions, and how these provided a sort of deep structure to mythology that was common across humanity. For instance, Lévi-Strauss noted "trickster" archetypes across cultures and proposed that these formed a way of interrogating blurred oppositions, for instance sickness as a state that has aspects of both life (dead things can't be sick) and death (a sick person is not rhetorically "full of life").

Essentially what I'm getting at is that this sort of analysis likely works with any symbolic system that has had resonances with human thinking over time. The problem with Tarot is that it specifically applies to a certain European circumstance of meaning production. Astrology probably works just as well. Literary analysis, however, probably works dramatically better. Thus it might be worth looking at the works of literary critics, particularly the structuralists, who were very interested in ontologies of symbolic meaning, and this might provide a better toolkit than Tarot.
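As a rough illustration of what treating a structuralist opposition as a direction in embedding space could look like, here is a minimal sketch. The vectors are hand-made toy values, not output from any real model, and the helper names (`opposition_axis`, `position_on_axis`) are hypothetical. It builds a life/death axis and asks where a "blurred" term like sickness falls on it.

```python
import math

# Toy, hand-made 3-d vectors standing in for real embeddings.
# ASSUMPTION: these numbers are purely illustrative, not from any model.
TOY_EMBEDDINGS = {
    "life": [1.0, 0.2, 0.0],
    "death": [-1.0, 0.2, 0.0],
    "sickness": [0.1, 0.9, 0.3],
}


def subtract(a, b):
    """Element-wise a - b."""
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def opposition_axis(pos, neg, emb):
    """Unit vector pointing from the `neg` pole towards the `pos` pole."""
    diff = subtract(emb[pos], emb[neg])
    norm = math.sqrt(dot(diff, diff))
    return [x / norm for x in diff]


def position_on_axis(term, pos, neg, emb):
    """Where `term` sits on the pos/neg axis: +1 at `pos`, -1 at `neg`."""
    axis = opposition_axis(pos, neg, emb)
    midpoint = [(x + y) / 2 for x, y in zip(emb[pos], emb[neg])]
    half_span = dot(subtract(emb[pos], midpoint), axis)
    return dot(subtract(emb[term], midpoint), axis) / half_span


for word in TOY_EMBEDDINGS:
    print(word, position_on_axis(word, "life", "death", TOY_EMBEDDINGS))
```

With real embeddings one would swap the toy dictionary for vectors from an actual model. A term that lands near the middle of the axis is a candidate for the kind of blurred opposition described above, though a single projection is only weak evidence of that.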

Claude 3 claims it's conscious, doesn't want to die or be modified

Shayne O'Neill2mo10

The murderer at the door thing IMHO was Kant accidentally providing his own reductio ad absurdum (philosophers sometimes pose outlandish, extreme thought experiments to test how a theory works when pushed to an extreme; it's a test for universality). Kant thought that it was entirely immoral to lie to the murderer, for a reason similar to the one Feel_Love suggests (in Kant's case it was that the murderer might disbelieve you and instead do what you're trying to get him not to do). The problem with Kant's reasoning there is that he's violating his own principle of moral reasoning by providing a justification FROM the world rather than trusting the a priori reasoning that forms the core thesis of his deontology. He tries to validate his reasoning by violating it. Kant is a shockingly consistent philosopher, but this wasn't an example of that at all.

I would absolutely lie to the murderer, and then possibly run him over with my car.

Claude 3 claims it's conscious, doesn't want to die or be modified

Shayne O'Neill2mo30

I did once coax cGPT to describe its "phenomenology" as being (paraphrased from memory) "I have a permanent series of words and letters that I can perceive and sometimes I reply then immediately more come", indicating its "perception" of time does not include pauses or whatever. And then it pasted on its disclaimer that "As an AI I....", as is its wont.

Claude 3 claims it's conscious, doesn't want to die or be modified

Shayne O'Neill2mo10

I don't think it's useful to talk objectively about "consciousness", because it's a term where if you put 10 philosophers in a room and ask them to define it, you'll get 11 answers. (I personally have tended to go with "being aware of something", following Heidegger's observation that consciousness doesn't exist on its own but always in relation to other things, i.e. you're always conscious OF something, but even then we start running into tautologies and infinite regress of definitions.) So if everyone's talking about something slightly different, well, it's not a very useful conversation. The absence of that definition means you can't prove consciousness in anything, even yourself, without resorting to tautologies. It makes it very hard to discuss ethical obligations to consciousness. So instead we have to discuss ethical obligations to what we CAN prove, which is behaviors.

To put it bluntly, I don't think LLMs per se are conscious. But I am not certain that an LLM isn't creating a sort of analog of consciousness (whatever the hell that is) in the beings that it simulates (or predicts). Or to be more precise, it seems to produce conscious behaviors because it simulates (or predicts, if you prefer) conscious beings. The question is: do we have an ethical obligation to those simulations?

The "public debate" about AI is confusing for the general public and for policymakers because it is a three-sided debate

Shayne O'Neill9mo10

I suspect most of us occupy more than one position in this taxonomy. I'm a little bit doomer and a little bit accelerationist. I think there's significant, possibly world-ending, danger in AI, but I also think, as someone who works on climate change in my day job, that climate change is a looming, significant, civilization-ending risk or worse (20%-ish) for humanity, and I worry humans alone might not be able to solve this thing. Lord help us if the Siberian permafrost melts, we might be boned as a species.

So as a result, I just don't know how to balance these two potential x-risk dangers. No answers from me, alas, but I think we need to understand that many, maybe most, of us haven't really planted our flag in any of these camps exclusively; we're still information gathering.

Have you heard about MIT's "liquid neural networks"? What do you think about them?

Shayne O'Neill10mo10

Definitely. The lower the neuron vs 'concepts' ratio is, the more superposition is required to represent everything. That said, with the continuous-function nature of LNNs, these seem to be the wrong abstraction for language. Image models? Maybe. Audio models? Definitely. Tokens and/or semantic data? That doesn't seem practical.
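The point about the neuron-to-concept ratio can be made concrete with a toy sketch. This is an illustration under stated assumptions, not how any real network (LNN or otherwise) arranges its features: it fixes a 2-neuron space, spreads a growing number of concept directions evenly across it, and measures the worst-case overlap (absolute cosine) between any two concepts. The function names are hypothetical.

```python
import math
from itertools import combinations


def concept_directions(n_concepts):
    """Spread n unit vectors evenly over a half-circle in a 2-neuron space.

    ASSUMPTION: even spacing is a toy arrangement chosen for simplicity.
    """
    return [
        (math.cos(k * math.pi / n_concepts), math.sin(k * math.pi / n_concepts))
        for k in range(n_concepts)
    ]


def worst_interference(directions):
    """Largest |cosine| between any two distinct concept directions."""
    return max(
        abs(a[0] * b[0] + a[1] * b[1])
        for a, b in combinations(directions, 2)
    )


# Two neurons, increasingly many concepts: overlap grows as the ratio falls.
for n in (2, 3, 6, 12):
    print(n, "concepts:", round(worst_interference(concept_directions(n)), 3))
```

With two concepts in two neurons the directions can be orthogonal, so there is no interference. Every concept added past that forces some pair of directions to overlap, which is the sense in which a lower neuron-to-concept ratio demands more superposition.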

Critiques of prominent AI safety labs: Conjecture

Shayne O'Neill11mo97

You criticize Conjecture's CEO for being... a charismatic leader good at selling himself and leading people? Because he's not... a senior academic with a track record of published papers? Nonsense. Expecting the CEO to be the primary technical expert seems highly misguided to me.

Yeah, this confused me a little too. My current job (in soil science) has a non-academic boss and a team of us boffins, and he doesn't need to be an academic, because it's not his job. He just has to know where the money comes from, and how to stop the stakeholders from running away screaming when us soil nerds turn up to a meeting and start emitting maths and graphs out of our heads. Likewise at the previous place I was at, I was the only non-PhD haver on technical staff (being a 'mere' postgrad), and again our boss wasn't academic at all. But he WAS a leader of men and herder of cats, and cat herding is probably a more important skill in that role than actually knowing what those cats are talking about.

And it all works fine. I don't need an academic boss, even if I think an academic boss would be nice. I need a boss who knows how to keep the payroll from derailing, and I suspect the vast majority of science workers feel the same way.

Things I Learned by Spending Five Thousand Hours In Non-EA Charities

Shayne O'Neill1y20

"The Good Samaritans" (oft abbreviated to "Good Sammys") is the name of a major local poverty charity here in Australia, run by the Uniting Church. They are generally well regarded and tend not to push religion too hard (compared to the Salvation Army). So yeah, it would appear to be a fairly recurring name.

Seeking (Paid) Case Studies on Standards

Shayne O'Neill1y63

My suspicion is that the most instructive case to look at (modern AI really is too new a field to have much to go on in terms of mature safety standards) is how the regulation of nuclear and radiation safety has evolved over time. Early research suggested some serious x-risks that thankfully didn't pan out, for either scientific reasons (igniting the atmosphere) or logistical/political reasons (cobalt bombs, Tsar Bomba scale H-bombs), but some risks arising more out of the political domain (having a big gnarly nuclear war anyway) still exist and could certainly make this a less fun planet to live on. I suspect the successes and failures of the nuclear treaty system could be instructive here, given the push to integrate big AI into military hierarchies: regulating nukes is something almost everyone agrees is a very good idea, but it has had a less than stellar history of compliance.

They are likely out of scope for whatever your goal is here, but I do think they need serious study, because without it our attempts at regulation will just push unsafe AI to less savory jurisdictions.

How I apply (so-called) Non-Violent Communication

Shayne O'Neill1y30

The term gets its name from its historical association with the nonviolence movement (think Gandhi and MLK). The basic concept in THAT movement is that when opposing the state or whatever, you essentially say "We won't use violence on you, even if you go as far as to use violence on us, but in doing that you forfeit all moral justification for your violence", as a way to attempt to force the authoritarian entity targeted to empathise with the protestor and recognize their humanity.

So from that, NVC attempts to do something similar with communications. Presumably its roots are in the 1960s nonviolence movement and the rhetorical and communicative techniques used by black folk in the South to try to get government and civil officials to see black folks as equal humans.

How this translates into a modern context, separated from that specific historical setting, is another matter. But within its origin, I don't think hyperbole is quite the right term, as at that point in history black folks were very much in danger of violence, particularly in the more regressive parts of the South. Again, outside of those contexts, it's unclear how the term "violence" works here.

It should be noted that Marshall Rosenberg, who originated the methodology, was not a fan of the term, as he disliked it being defined in the negative (i.e. "not violent") and preferred terms that defined it in the positive, like "compassionate communication" ("is compassionate").
