All of bayesed's Comments + Replies

But if that's the case, he could simply mention the amount he's willing to bet. The phrasing kinda suggested to me that he doesn't have all the info needed to do the Kelly calculation yet.

Bayesed is correct. There's a whole bunch of factors which affect whether a bet is worthwhile.

What do you mean when you say you're "willing to bet according to the Kelly Criterion"? If you're proposing a bet with 99% odds and your actual belief that you'll win the bet is also 99%, then the Kelly Criterion would advise against betting (since the EV of such a bet would be zero, i.e. merely neutral).

Perhaps you mean that the other person should come up with the odds, and then you'll determine your bet amount using the Kelly criterion, assuming a 99% probability of winning for yourself.
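
The zero-stake claim above can be checked with a short sketch. It assumes the standard Kelly formula for a binary bet, f* = p - (1 - p)/b, where b is the net payout per unit staked, and it reads "99% odds" as a price that implies a 99% chance of winning. The function names are hypothetical and only for illustration.

```python
def net_odds_from_implied_prob(implied_prob: float) -> float:
    """Net payout per unit staked for a bet priced at `implied_prob`.

    Assumption: "99% odds" means you risk 0.99 to win 0.01,
    so the net odds are (1 - 0.99) / 0.99.
    """
    if not 0.0 < implied_prob < 1.0:
        raise ValueError("implied_prob must be strictly between 0 and 1")
    return (1.0 - implied_prob) / implied_prob


def kelly_fraction(p_win: float, net_odds: float) -> float:
    """Standard Kelly fraction of bankroll for a binary bet.

    A result of zero or less means the criterion advises not betting.
    """
    return p_win - (1.0 - p_win) / net_odds


# Belief matches the offered odds: the recommended stake is (numerically) zero.
fair = kelly_fraction(0.99, net_odds_from_implied_prob(0.99))

# Same belief, but the other side offers even money: stake most of the bankroll.
generous = kelly_fraction(0.99, 1.0)
```

Under these assumptions the stake is positive only when the counterparty's odds are better than your own probability implies, which is why the bet size depends on who sets the odds.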

Most bets I see are on the order of $10-$1000 which, according to the Kelly Criterion, implies negligible confidence. I'm willing to bet substantially more than that. If we had a real prediction market with proper derivatives, low fees, high liquidity, reputable oracles, etcetera, then I'd just use the standard exchange, but we don't. Consequently, market friction vastly outweighs actual probabilities in importance.

Bingo. This is exactly what I mean. Thank you for clarifying. It is important to note that "probability of winning" is not the same as "probability of getting paid, and thus profiting". It's the latter that I care about.
Obviously he thinks the chances are lower than 1% that this is true, if he's willing to bet at 99% odds.

Yeah, based on EY's previous tweets regarding this, it seemed like it was supposed to be a TED talk.

Example origin scenario of this Nash equilibrium from GPT-4:

In this hypothetical scenario, let's imagine that the prisoners are all part of a research experiment on group dynamics and cooperation. Prisoners come from different factions that have a history of rivalry and distrust. 

Initially, each prisoner sets their dial to 30 degrees Celsius, creating a comfortable environment. However, due to the existing distrust and rivalry, some prisoners suspect that deviations from the norm—whether upward or downward—could be a secret signal from one faction to ... (read more)

This makes a lot of sense, but it also gives enough motivation to enforce a standard that the modeling of utility and the mention of Nash equilibria become secondary. Trying to formalize it by identifying the sizes of the coalitions, the utility of preventing this communication/conspiracy channel among the rivals, and the actual communications which let them "decide to adopt a strategy of punishing any deviations from the new 99 degrees Celsius temperature" will likely break it again. In fact, if there is any coalition of at least 2, they can get a minmax result of 98.6, so presumably they won't participate in or enforce 99 as an equilibrium.

Hmm, I've not seen people refer to (ChatGPT + Code execution plugin) as an LLM. IMO, an LLM is supposed to be a language model consisting of just a neural network with a large number of parameters.

1Jonathan Marcus5mo
I think your definition of LLM is the common one. For example, is on the front page right now, and it uses LLM to refer to a big neural net, in a transformer topology, trained with a lot of data. This is how I was intending to use it as well. Note the difference between "language model" as Christopher King used it, and "large language model" as I am. I plan to keep using LLM for now, especially as GPT refers to OpenAI's product and not the general class of things.

I'm a bit confused about this post. Are you saying it is theoretically impossible to create an LLM that can do 3*3 matrix multiplication without using chain of thought? That seems false. 

The amount of computation an LLM has done so far will be a function of both the size of the LLM (call it the s factor) and the number of tokens generated so far (n). Let's say matrix multiplication of n*n matrices requires cn^3 amount of computation (actually, there are more efficient algos, but it doesn't matter).

You can do this by either using a small LLM and n^3 tok... (read more)
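
The trade-off being described can be sketched numerically. The comment is truncated here, so this is only one reading of it: total computation is assumed to be roughly (compute per token) x (tokens generated), with compute per token standing in for the size factor s. The function name and the separate `matrix_dim` parameter are hypothetical; the comment itself uses n for both the token count and the matrix dimension.

```python
import math


def tokens_needed(matrix_dim: int, compute_per_token: float, c: float = 1.0) -> int:
    """Tokens a model must emit before it has done c * matrix_dim**3 work.

    Assumption: every generated token contributes `compute_per_token`
    units of computation, and at least one token is always generated.
    """
    if matrix_dim < 1 or compute_per_token <= 0:
        raise ValueError("matrix_dim must be >= 1 and compute_per_token > 0")
    required = c * matrix_dim ** 3
    return max(1, math.ceil(required / compute_per_token))


# A small model (1 unit per token) needs 27 tokens of "scratch work" for 3x3.
small_model = tokens_needed(3, compute_per_token=1.0)

# A model 27x larger can, under this toy accounting, answer in a single token.
large_model = tokens_needed(3, compute_per_token=27.0)
```

In this toy accounting a larger constant factor always buys a shorter chain of thought for a fixed problem size, which is the sense in which the limit is not a fundamental one.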

2Jonathan Marcus5mo
Okay, you raise a very good point. To analogize to my own brain: it's like noticing that I can multiply integers 1-20 in my head in one step, but for larger numbers I need to write it out. Does that mean that my neural net can do multiplication? Well, as you say, it depends on n. Nitpick: for SHA1 (and any other cryptographic hash function) I can't fathom how an LLM could learn it through SGD, as opposed to being hand coded. To do SHA1 correctly you need to implement its internals correctly; being off a little bit will result in a completely incorrect output. It's all or nothing, so there's no way to gradually approach getting it right, so there's no gradient to descend. But your overall point still stands. It is theoretically possible for a transformer to learn any function, so this is not a fundamental upper bound, and you are therefore correct that a large enough LLM could do any of these for small n. I wonder if this is going to be a real capability of SOTA LLMs, or will it be one of those "possible in theory, but many orders of magnitude off in practice" things. Ultimately the question I'm thinking towards is whether an LLM could do the truly important/scary problems. I care less whether an LLM can multiply two 1Mx1M (or 3x3) matrices, and more whether it can devise & execute a 50-step plan for world domination, or make important new discoveries in nanotech, or make billions of dollars in financial markets, etc. I don't know how to evaluate the computational complexity of these hard problems. I also don't know whether exploring that question would help the capabilities side more than the alignment side, so I need to think carefully before answering.
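
The "all or nothing" character of SHA1 can be illustrated with a related, easier-to-demonstrate property: changing the input slightly changes roughly half of the output bits. This is a sketch of the avalanche effect using Python's standard `hashlib`, not a demonstration of what happens when the internals are implemented slightly wrong; the helper name is hypothetical.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which SHA-1(a) and SHA-1(b) differ (0 to 160)."""
    digest_a = int.from_bytes(hashlib.sha1(a).digest(), "big")
    digest_b = int.from_bytes(hashlib.sha1(b).digest(), "big")
    return bin(digest_a ^ digest_b).count("1")


# Identical inputs give identical digests.
same = differing_bits(b"hello", b"hello")

# Inputs differing in a single character give digests that differ in
# a large share of their 160 bits, so "nearly right" is not "nearly equal".
near = differing_bits(b"hello", b"hellp")
```

Because closeness in the input gives no closeness in the output, there is no obvious partial-credit signal of the kind gradient descent would need.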
1Mario Schlosser5mo
Agree. If GPT-4 can solve 3-dim matrix multiplication with chain-of-thought, then doesn't that mean you could just take the last layer's output (before you generate a single token from it) and send it into other instances of GPT-4, and then chain together their output? That should by definition be enough "internal state-keeping" that you wouldn't need it to do the "note-keeping" of chain-of-thought. And that's precisely bayesed's point - because from the outside, that kind of a construct would just look like a bigger LLM. I think this is a clever post, but the bottlenecking created by token generation is too arbitrary of a way to assess LLM complexity.
3Jonathan Marcus5mo
Thanks for the deep and thoughtful comment. I hadn't considered that "In general, you can always get a bigger and bigger constant factor to solve problems with higher n." I'm going to think carefully and then see how this changes my thinking. I'll try to reply again soon.

I don't think GPT4 can be used with plugins in ChatGPT. It seems to be a different model, probably based on GPT3.5 (evidence: the color of the icon is green, not black; seems faster than GPT4; no limits or quota; no explicit mention of GPT4 anywhere in announcement).

So I think there's a good chance the title is wrong.

Additional comments on creative mode by Mikhail (from today):

We will {...increase the speed of creative mode...}, but it probably always be somewhat slower, by definition: it generates longer responses, has larger context.

Our current thinking is to keep maximum quality in Creative, which means slower speed.

Our current thinking about Bing Chat modes: Balanced: best for the most common tasks,

... (read more)

Based on Mikhail's Twitter comments, 'precise' and 'creative' don't seem to be too much more than simply the 'temperature' hyperparameter for sampling. 'Precise' would presumably correspond to very low, near-zero or zero, highly deterministic samples.
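
For readers unfamiliar with the 'temperature' hyperparameter mentioned above, here is a minimal sketch of how it reshapes a sampling distribution. It assumes the usual softmax-with-temperature definition and uses made-up logits; nothing here describes how Bing Chat is actually configured.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits into probabilities after dividing them by `temperature`.

    Low temperatures concentrate probability on the top logit (near-deterministic
    sampling); high temperatures flatten the distribution.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive; use argmax for greedy decoding")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
near_greedy = softmax_with_temperature(logits, 0.1)  # almost all mass on the first token
flatter = softmax_with_temperature(logits, 2.0)      # mass spread across all three
```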

Nope, Mikhail has said the opposite:

Nope, the temperature is (roughly) the same.

So I'd guess the main difference is in the prompt.

That's interesting. Earlier, he was very explicitly identifying temperature with creativity in the Tweets I collated when commenting about how the controls worked. So now if the temperature is identical but he's calling whatever it is 'creative', he's completely flipped his position on "hallucinations = creativity", apparently. Hm. So it's the same temperature, but it's more expensive, which has 'longer output, more expressive, slower', requires more context... That could point to it being a different model under the hood. But it could also point to a different approach entirely, like implementing best-of sampling, or perhaps some inner-monologue-like approach like a hidden prompt generating several options and then another prompt to pick "the most creative" one. There were some earlier comments about Sydney possibly having a hidden inner-monologue scratchpad/buffer where it could do a bunch of outputs before returning only 1 visible answer to the user. (This could be parallelized if you generated the n suggestions in parallel and didn't mind the possible redundancy, but is inherently still more serial steps than simply generating 1 answer immediately.) This could be 'pick the most creative one' for creative mode, or 'pick the most correct one' for 'precise' mode, etc. So this wouldn't necessarily be anything new and could have been iterated very quickly (but as he says, it'd be inherently slower, generate longer responses, and be more expensive, and be hard to optimize much more). This is something you could try to replicate with ChatGPT/GPT-4. Ask it to generate several different answers to the Monty Fall problem, and then ask it for the most correct one.
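
The best-of sampling idea speculated about above can be sketched as follows. This is a guess at the general technique, not a description of Bing's implementation: `generate` and `score` are hypothetical callables standing in for "a prompt that produces a candidate" and "a prompt that rates a candidate".

```python
from typing import Callable, List


def best_of_n(
    prompt: str,
    generate: Callable[[str, int], str],
    score: Callable[[str, str], float],
    n: int = 4,
) -> str:
    """Generate n candidate answers and return the one with the highest score.

    `generate(prompt, i)` returns the i-th candidate; `score(prompt, candidate)`
    returns a number where higher means "more creative" or "more correct",
    depending on the mode being imitated.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # The n generations are independent and could run in parallel.
    candidates: List[str] = [generate(prompt, i) for i in range(n)]
    # Selection is an extra serial step after all candidates exist.
    return max(candidates, key=lambda c: score(prompt, c))


# Toy stand-ins so the sketch runs without any model behind it.
def toy_generate(prompt: str, i: int) -> str:
    return f"{prompt} / draft {i}"


def toy_score(prompt: str, candidate: str) -> float:
    return float(candidate.rsplit(" ", 1)[1])


chosen = best_of_n("Monty Fall problem", toy_generate, toy_score, n=3)
```

The extra scoring pass is what would make such a mode slower and more expensive than returning a single sample, even at the same temperature.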
Additional comments on creative mode by Mikhail (from today): So creative mode definitely has larger context size, and might also be a larger model?

I think it's more of a correction than a misunderstanding. It shouldn't be assumed that "value" just means human civilization and its potential. Most people reading this post will assume "wiping out all value" to mean wiping out all that we value, not just wiping out humanity. But this is clearly not true, as most people value life and sentience in general, so a universe where all alien civs also end up dying due to our ASI is far worse than the one where there are survivors.

Sure; though what I imagine is more "Human ASI destroys all human value and spreads until it hits defended borders of alien ASI that has also destroyed all alien value..." (Though I don't think this is the case. The sun is still there, so I doubt alien ASI exists. The universe isn't that young.)

Minor (?) correction: You've mentioned multiple times that our ASI will wipe out all value in the universe, but that's very unlikely to happen. We won't be the only (or the first) civilization to have created ASI, so eventually our ASI will run into other rogue/aligned ASIs and be forced to negotiate.

Relevant EY tweets:

People who value life and sentience, and think sanely, know that the future galaxies are the real value at risk.


Yes, I mean that I expect AGI ruin to wipe out all galaxies in its

... (read more)
I believe this is a misunderstanding: ASI will wipe out all human value in the universe.

That part was a bit unclear. I guess he could work with redwood/conjecture without necessarily quitting his MIRI position?

Suppose you lived in the dark times, where children have a <50% chance of living to adulthood. Wouldn't you still have kids? Even if probabilistically smallpox was likely to take them?

Just wanna add that each of your children individually having a 50% chance of survival due to smallpox is different from all of your children together having a 50% chance of survival due to AI (i.e. uncorrelated vs correlated), so some people might decide differently in these 2 cases.
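
The difference can be made concrete with a little arithmetic. The sketch below assumes the two extreme cases: fully independent risks (the smallpox reading) and perfectly correlated risks where all children share one outcome (the AI reading). The function names are hypothetical.

```python
def p_any_survive_independent(num_children: int, p_survive: float = 0.5) -> float:
    """Chance at least one child survives when each survives independently."""
    return 1.0 - (1.0 - p_survive) ** num_children


def p_any_survive_correlated(num_children: int, p_survive: float = 0.5) -> float:
    """Chance at least one child survives when all share a single outcome."""
    return p_survive if num_children >= 1 else 0.0


# With three children and 50% odds, independent risks leave an 87.5% chance
# that at least one child reaches adulthood; a shared risk leaves only 50%.
independent = p_any_survive_independent(3)
correlated = p_any_survive_correlated(3)
```

The expected number of surviving children is the same in both cases; what changes is the chance that none survive, which may be the part that drives the decision.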

AFAIK, it is not necessary to "accurately reverse engineer human values and also accurately encode them". That's considered too hard, and as you say, not tractable anytime soon. Further, even if you're able to do that, you've only solved outer alignment; inner alignment still remains unsolved.

Instead, the aim is to build "corrigible" AIs. See Let's See You Write That Corrigibility Tag, Corrigibility (Arbital), Hard problem of corrigibility (Arbital).

Quoting from the last link:

The "hard problem of corrigibility" is to build an agent which, in a

... (read more)

Interesting example, but I still feel like Bob doesn't need to contradict Alice's known beliefs.

If Bob found a page from Alice's notebook that said "time and space are relative," he could update his understanding to realize that the theory of Newtonian physics he's been using is only an approximation, and not the real physics of the universe. Then, he could try to come up with upper bounds on how inaccurate Newtonian physics is, by thinking about his past experiences or doing new experiments. Even so, he could still keep using Newtonian physics, with the u... (read more)