Aaro Salosensaari - LessWrong

Douglas Hofstadter changes his mind on Deep Learning & AI risk (June 2023)?

>It turns out that using Transformers in the autoregressive mode (with output tokens being added back to the input by concatenating the previous input and the new output token, and sending the new versions of the input through the model again and again) results in them emulating dynamics of recurrent neural networks, and that clarifies things a lot...

I'll bite: Could you dumb down the implications of the paper a little bit, what is the difference between a Transformer emulating a RNN and some pre-Transformer RNNs and/or not-RNN?

My much more novice-level answer to Hofstadter's intuition would have been: it's not the feedforward firing, but it is the gradient descent training of the model on massive scale (both in data and in computation). But apparently you think that something RNN-like about the model structure itself is important?

Latent variables for prediction markets: motivation, technical guide, and design considerations

Aaro Salosensaari1y10

Epistemic status: I am probably misunderstanding some critical parts of the theory, and I am quite ignorant on technical implementation of prediction markets. But posting this could be useful for my and others' learning.

First question. Am I understanding correctly how the market would function. Taking your IRT probit market example, here is what I gather:

(1) I want to make a bet on the conditional market P(X_i | Y). I have a visual UI where I slide bars to make a bet on parameters a and b; (dropping subscript i) however, internally this is represented by a bet on a' = a sigma_y and P(X) = Phi(b'/sqrt(a'+1)), b' = b + a mu_y. So far, so clear.

(2) I want to bet on the latent market P(Y). I make bet on mu_Y and sigma_y, which is internally represented by a bet on a' and P(X). In your demo, this is explicit: P(Y) actually remains a standard normal distribution, it is the conditional distributions that shift.

(3) I want to bet on the unconditional market P(X_i), which happens to be an indicator variable for some latent variable market I don't really care about. I want to make a bet on P(X_i) only. What does exactly happen? If the internal technical representation is a database row [p_i, a'] changing to [p_new, a'], then I must be implicitly making a bet on a' and b' too, as b' derived from a and P(X) and P(X) changed. In other words, it appears I am also making a bet on the conditional market?

Maybe this is OK if I simply ignore the latent variable market, but I could also disagree with it: I think P(X) is not correlated with the other indicators, or the correlation structure implied by linear probit IRT model is inadequate. Can I make a bet of form [p_i, NA] and have it not to participate in the latent variable market structure? (Ie when the indicator market resolves, p_i is taken into account for scoring, NA is ignored while scoring Y).

Thus, I can launch a competing latent variable market for the same indicators? How can I make a bet against a particular latent structure in favor of another structure? A model selection bet, if you will.

Context for the question: In my limited experience of latent variable models for statistical inference tasks, the most difficult and often contested question is whether the latent structure you specified is good one. The choice of model and interpretations of it are very likely to get criticized and debated in published work.

Another question for discussion.

It seems theoretically possible to obtain individual predictors' bets / predictions on X_1 ... X_k, without presence of any latent variable market, impute for the missing values if some predictors have not predicted on all k indicators, and then estimate a latent variable model on this data. What is the exact benefit of having the latent model available for betting? If fit a Bayesian model (non-market LVM with indicator datapoints) and it converges, from this non-market model, I would obtain posterior distribution for parameters and compute many of the statistics of interest.

Presumably, if there is a latent "ground truth" relationship between all indicator questions and the market participants are considering them, the relationship would be recovered by such analysis. If the model is non-identifiable or misspecified (= such ground truth does not exist / is not recoverable), I will have problems fitting it and with the interpretation of fit. (And if I were to run a prediction market for a badly specified LVM, the market would exhibit problematic behavior.)

By not making a prediction market for the latent model available, I won't have predictions on the latent distribution directly, but on the other hand, I could try to estimate several different without imposing any of their assumptions about presumed joint structure on the market. I could see this beneficial, depending on how the predictions on Y "propagate" to indicator markets or the other way around (case (3) above).

Final question for discussion: Suppose I run a prediction market using the current market implementations (Metaculus or Manifold or something else, without any LVM support), where I promise to fit a particular IRT probit model at time point t in future and ask for predictions on the distribution of posterior model parameters estimates when I fit the model. Wouldn't I obtain most of the benefits of "native technical implementation" of a LVM market ... except that updating away any inconsistencies between the LVM prediction market estimate and individual indicator market will be up to to individual market participants? The bets on this estimated LVM market at least should behave as any other bets, right? (Related: which should be the time point t when I promise to fit the model and resolve the market? t = when the first indicator market resolves? when the last indicator market resolves? arbitrary fixed time point T?)

On not getting contaminated by the wrong obesity ideas

Aaro Salosensaari1y73

>Glancing back and forth, I keep changing my mind about whether or not I think the messy empirical data is close enough to the prediction from the normal distribution to accept your conclusion, or whether that elbow feature around 1976-80 seems compelling.

I realize you two had a long discussion about this, but my few cents: This kind of situation (eyeballing is not enough to resolve which of two models fit the data better) is exactly the kind of situation for which a concept of statistical inference is very useful.

I'm a bit too busy right now to present a computation, but my first idea would be to gather the data and run a simple "bootstrappy" simulation: 1) Get the original data set. 2) Generate k = 1 ... N simulated samples x^k = [x^k_1, ... x^k_t] form a normal distribution with linearly increasing mean mu(t) = mu + c * t at time points t= 1960 ... 2018, where c and variance are as in "linear increase hypothesis". 3) How many of simulated replicate time series have an elbow at 1980 that is equally or more extreme than observed in the data? (One could do this not too informal way by fitting a piece-wise regression model with break at t = 2018 to reach replicate time series, and computing if the two slope estimates differ by a predetermined threshold, such as the estimates recovered by fitting the same piece-wise model in the real data).

This is slightly ad hoc, and there are probably fancier statistical methods for this kind of test, or you could fits some kind of Bayesian model, but I'd think such computational exercise would be illustrative.

The Alignment Community Is Culturally Broken

Aaro Salosensaari2y54

Hyperbole aside, how many of those experts linked (and/or contributing to the 10% / 2% estimate) have arrived to their conclusion with a thought process that is "downstream" from the thoughtspace the parent commenter thinks suspect? Then it would not qualify as independent evidence or rebuttal, as it is included as the target of criticism.

Open & Welcome Thread - July 2022

Aaro Salosensaari2y30

Thanks. I had read it years ago, but didn't remember that he had many more points than O(n^3.5 log(1/h)) scale and provides useful references (other than Red Plenty).

Open & Welcome Thread - July 2022

Aaro Salosensaari2y10

(I initially thought it would be better not to mention the context of the question as it might bias the responses. OTOH the context could make the marginal LW poster more interested in providing answers, so I here it is:)

It came up in an argument that the difficulty of economic calculation problem could be a difficult to a hypothetical singleton, insomuch a singleton agent needs certain amount of compute relative to the economy in question. My intuition consists two related hypotheses: First, during any transition period where any agent participates in global economy where most other participants are humans ("economy" could be interpreted widely to include many human transactions), can the problem of economic calculation provide some limits how much calculation would be needed for an agent to become able to manipulate / dominate the economy? (Is it enough for an agent to be marginally more capable than any other participant, or does it get swamped by the sheer size of the economy is large enough?)

Secondly, if an Mises/Hayek answer is correct and the economic calculation problem is solved most efficiently by a distributed calculation, it could imply that a single agent in a charge of a number of processes on "global economy" scale could be out-competed by a community of coordinating agents. [1]

However, I would like to read more to judge if my intuitions are correct. Maybe all of this is already rendered moot by results I simply do not how to find.

([1] Related but tangential: Can one provide a definition when distributed computation is no longer a singleton but more-or-less aligned community of individual agents? My hunch is, there could be a characterizations related to speed of communication between agents / processes in a singleton. Ultimately speed of light is prone to mandate some limitations.)

Open & Welcome Thread - July 2022

Aaro Salosensaari2y10

Can anyone recommend good reading material on economic calculation problem?

The Cage of the Language

Aaro Salosensaari2y10

I found this interesting. Finnish is also language of about 5 million speakers, but we have a commonly used natural translation of "economies of scale" (mittakaavaetu, "benefit of scale"). Any commonplace obvious translation for "Single point of failure" didn't strike my mind, so I googled, and found engineering MSc thesis works and similar documents: the words they choose to use included yksittäinen kriittinen prosessi ("single critical process", most natural one IMO), yksittäinen vikaantumispiste ("single point of failure", literal translation and a bit clumsy one), yksittäinen riskikohde ("single object of risk", makes sense but only in the context), and several phrases that chose to explain the concept.

Small languages need active caretaking and cultivation so that translations of novel concepts are introduced, and then there is active intellectual life where they are used. In Finnish, this work has been done for "economies of scale", but less efficiently for "single point of failure". But I believe one could use any of the translations I found or invent ones own, maybe add the English term in parenthesis, and not look like a crackpot. (Because I would expect quite a few people are familiar with the concept with its English name. In a verbal argument I would expect a politician just say it in English if they didn't know an established equivalent in Finnish. Using English is fancy and high-status in Finland in the way French was fancy and high-status in the 19th century.)

In another comment you make a comparison to LW concepts like Moloch. [1] I think the idea of "cultivation" is also applicable to LW shibboleths too, especially in the context of the old tagline "rising the sanity waterline". It is useful to have a good language - or very least, good words for important concepts. (And also maybe avoid promoting words for concepts that are not well thought-out.) Making such words common requires active work, which includes care in choice of words that are descriptive/good-sounding and good judgement to choose words and use-patterns that they can become popular and make it to common use without sounding crackpot-ish. (In-group shibboleths can become a failure mode.)

Lack of such work is obvious sooner in small languages than in larger ones, but even large language with many speakers, like English, miss words for every concept they have not yet adopted a word for.

In Iceland, much smaller language, I have heard they translate and localize everything.

[1] https://www.lesswrong.com/posts/zSbrbiNmNp87eroHN/the-cage-of-the-language?commentId=rCWFaLH5FLoH8WKQK

Questions about ''formalizing instrumental goals"

Aaro Salosensaari2y20

"if I were an AGI, then I'd be able to solve this problem" "I can easily imagine"

Doesn't this way of analysis come with a ton of other assumptions left unstated?

Suppose "I" am an AGI running on a data center and I can modeled as an agent with some objective function that manifest as desires and I know my instantiation needs electricity and GPUs to continue running. Creating another copy of "I" running in the same data center will use the same resources. Creating another copy in some other data center requires some other data center.

Depending on the objective function and algorithm and hardware architecture bunch of other things, creating copies may result some benefits from distributed computation (actually it is quite unclear to me if "I" happen already to be a distributed computation running on thousands of GPUs -- do "I" maintain even a sense of self -- but let's no go into that).

The key here is the word may. Not obviously it necessarily follows that...

For example: Is the objective function specified so that the agent will find creating a copy of itself beneficial for fulfilling the objective function (informally, it has internal experience of desiring to create copies)? As the OP points out, there might be a disagreement: for the distributed copies to be any useful, they will have different inputs and thus they will end in different, unanticipated states. What "I" am to do when "I" disagree another "I"? What if some other "I" changes, modifies its objective function into something unrecognizable to "me", and when "we" meet, it gives false pretenses of cooperating but in reality only wants hijack "my" resources? Is the "trust" even the correct word here, when "I" could verify instead: maybe "I" prefer to create and run a subroutine of limited capability (not a full copy) that can prove its objective function has remained compatible with "my" objective function and will terminate willingly after it's done with its task (killswitch OP mentions) ? But doesn't this sound quite like our (not "our" but us humans) alignment problem? Would you say "I can easily imagine if I were an AGI, I'd be easily able to solve it" to that? Huh? Reading LW I have come to think the problem is difficult to the human-general intelligence.

Secondly: If "I" don't have any model of data centers existing in the real world, only the experience of uploading myself to other data centers (assuming for the sake of argument all the practical details of that can be handwaved off), i.e. it has a bad model of the self-other boundary described in OPs essay, it could easily end up copying itself to all available data centers and then becoming stuck without any free compute left to "eat" and adversely affecting human ability to produce more. Compatible with model and its results in the original paper (take the non-null actions to consume resource because U doesn't view the region as otherwise valuable). It is some other assumptions (not the theory) that posit an real-world affecting AGI would have U that doesn't consider the economy of producing the resources it needs.

So if "I" were to successful in running myself with only "I" and my subroutines, "I" should have a way to affecting the real world and producing computronium for my continued existence. Quite a task to handwaved away as trivial! How much compute an agent running in one data center (/unit of computronium) needs to successfully model all the economic constraints that go into the maintenance of one data center? Then add all the robotics to do anything. If "I" have a model of running everything a chip fab requires more efficiently than the current economy, and act on it, but the model was imperfect and the attempt is unsuccessful but destructive to economy, well, that could be [bs]ad and definitely a problem. But it is a real constraint to the kind of simplifying assumptions the OP critiques (disembodied deployer of resources with total knowledge).

All of this --how would "I" solve a problem and what problems "I" am aware of-- is contingent on, I would call them, the implementation details. And I think author is right to point them out. Maybe it does necessary follows, but it needs to be argued so.

[RETRACTED] It's time for EA leadership to pull the short-timelines fire alarm.

Aaro Salosensaari2y10

Why wonder when you can think: What is the substantial difference in MuZero (as described in [1]) that makes the algorithm to consider interruptions?

Maybe I show some great ignorance of MDPs, but naively I don't see how an interrupted game could come into play as a signal in the specified implementations of MuZero:

Explicit signals I can't see, because the explicitly specified reward u seems contingent ultimately only on the game state / win condition.

One can hypothesize an implicit signal could be introduced if algorithm learns to "avoid game states that result in game being terminated for out-of-game reason / game not played until the end condition", but how such learning would happen? Can MuZero interrupt the game during training? Sounds unlikely such move would be implemented in Go or Shogi environment. Are there any combination of moves in Atari game that could cause it?

[1] https://arxiv.org/abs/1911.08265

LESSWRONG
LW

Posts

Wiki Contributions

Comments