Getting LLMs to be deterministic when scoring the quality of a text is hard. If you ask ChatGPT to evaluate the same poem multiple times, you’ll get inconsistent responses. I’ve been thinking about whether there are ways to make LLM grading more consistent. We can take a hint from specific...
The following essay is largely plagiarized from Here Is New York, by E. B. White. It is a miracle that Online works at all. The whole arrangement is improbable. People tap a piece of glass and expect, correctly, that their words will depart their room, enter a tangle of copper and...
If you accept the core premises of Eliezer's book, then you believe that we're building systems we cannot control.[1] Much of the field of AI alignment pretends:

* we can control increasingly powerful AI systems
* LLMs are the key AI system for us to learn how to control

These...
TLDR: Method Iteration is an LLM prompting technique that elicits better responses to hard problems. Some researchers think that for AI to solve truly hard problems, we need bigger models, more data, or new architectures. I wonder if there's another way. The text you get from an LLM is downstream...
In the face of any hard problem—reversing climate change, curing cancer, or starting a great novel—modern LLMs can generate thousands of possible solutions relatively cheaply. Most solutions from most prompts are bad: they’re not new relative to the state of the art, not feasible, or not significant enough. But for...
American democracy currently operates far below its theoretical ideal. An ideal democracy precisely captures and represents the nuanced collective desires of its constituents, synthesizing diverse individual preferences into coherent, actionable policy. Today's system offers no direct path for citizens to express individual priorities. Instead, voters select candidates whose platforms only...
This essay suggests that a loving superintelligence may outcompete a selfish one, then recommends actions AI labs can take to increase the chance of that outcome. The reasoning below is inspired primarily by Eliezer Yudkowsky, Joscha Bach, Michael Levin, and Charles Darwin. Superintelligence (SI) is near. Superintelligence will evolve...