pmarc

Thank you for this post! From a statistics (rather than computer science) background, I have encountered similar discussions in the context of Bayesian model averaging, and in particular I would recommend this publication if you don't already know about it:
"Using Stacking to Average Bayesian Predictive Distributions"
https://sites.stat.columbia.edu/gelman/research/published/stacking_paper_discussion_rejoinder.pdf
One of the main limitations they note about Bayes factors, the classic basis for Bayesian model averaging, is that they are sensitive to how vague your initial priors were for the adjustable parameters of your competing models, so I'm not sure how much this applies to your example. It depends on whether you think of your competing hypotheses as having free parameters to estimate before making... (read more)
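The prior-sensitivity issue can be shown with a toy Gaussian comparison (a minimal sketch with illustrative numbers, not from the paper): the marginal likelihood of a model with a free parameter shrinks as that parameter's prior gets vaguer, so the Bayes factor against a fixed model depends directly on the prior scale.

```python
import numpy as np
from scipy.stats import norm

# Toy Bartlett/Lindley effect: compare M1 (y ~ N(0,1), no free parameter)
# against M2 (y ~ N(mu,1) with prior mu ~ N(0, tau^2)). The marginal
# likelihood of y under M2 integrates out mu and is N(0, 1 + tau^2),
# so the Bayes factor depends on the prior scale tau.
y = 1.5                  # a single hypothetical observation
m1 = norm.pdf(y, 0, 1)   # evidence for M1

for tau in [1, 10, 100, 1000]:
    m2 = norm.pdf(y, 0, np.sqrt(1 + tau**2))  # evidence for M2
    print(f"tau={tau:>4}: BF(M2 vs M1) = {m2 / m1:.4f}")
# The vaguer the prior (larger tau), the smaller the Bayes factor,
# regardless of how well M2's best-fit mu explains the data.
```

Stacking, as in the linked paper, sidesteps this because the weights are chosen by out-of-sample predictive performance rather than marginal likelihoods.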
I agree there might not be a way to differentiate (2) and (3) in the context of the current LLM framework. From a safety point of view, I would think that if that LLM framework scales to a dangerous capability level, (2) is as dangerous as (3) if the consequences are the same. In particular, that would suggest whatever inspiration the LLM is getting from stories of "selfish AI" in the training set is strong enough to make it override instructions.
If LLMs don't scale to that point, then maybe (2) is optimistic in the sense that the problem is specific to the particular way LLMs are trained. In other words, if it's... (read more)
Thank you for sharing the whitepaper.
I was a bit shocked to read:
This is a non-trivial task, since existing analysis methods struggle to identify combinatorial, non-linear patterns. Our collaborators at IPSiM told us that they would typically spend ‘months scrolling in excel’ to make sense of this data – in contrast to our automated system, which identified these relationships in less than an hour.
coming from a closely related field (ecology research) where, 10 years ago, statistical software like R was in wide use and ML methods like random forests were already gaining in popularity. But of course, my perspective is biased from spending a lot more time in the "quant" circles within the... (read more)
Yes, when trying to reuse the OP's phrasing, maybe I wasn't specific enough about what I meant. I wanted to highlight that the "fraction of variance explained" metric generalizes less well than other outputs from the same model.
For example, consider a case where a model of E[y] vs. x provides good out-of-sample predictions even if the distribution of x changes, e.g. because x stays within the range used to fit the model; the fraction of variance explained is nevertheless sensitive to the distribution of x. Of course, you can have a confounder w that makes y(x) less accurate out-of-sample because its distribution changes and indirectly "breaks" the learned y(x) relationship, but... (read more)
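A minimal simulation of that first case (hypothetical numbers, assuming a simple linear model): predictions stay equally accurate on a narrower x distribution, but the fraction of variance explained drops because Var(y) shrinks.

```python
import numpy as np

rng = np.random.default_rng(0)
slope, noise_sd = 2.0, 1.0

def make_data(lo, hi, n=10_000):
    x = rng.uniform(lo, hi, n)
    y = slope * x + rng.normal(0, noise_sd, n)
    return x, y

# fit a linear model on a wide x range
x_tr, y_tr = make_data(0, 10)
b, a = np.polyfit(x_tr, y_tr, 1)  # estimated slope, intercept

def rmse(x, y):
    return np.sqrt(np.mean((y - (a + b * x)) ** 2))

def r2(x, y):
    resid = y - (a + b * x)
    return 1 - resid.var() / y.var()

# same model, two test distributions of x within the training range
for lo, hi in [(0, 10), (4, 6)]:
    x_te, y_te = make_data(lo, hi)
    print(f"x in [{lo},{hi}]: RMSE = {rmse(x_te, y_te):.2f}, "
          f"R^2 = {r2(x_te, y_te):.2f}")
# RMSE is ~1.0 in both cases (predictions transfer fine), but R^2 falls
# on the narrow range because the variance of y to be "explained" shrinks.
```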
Yes, in expectation. But you're adding a lot of variance. I would think the stochasticity of the punishment affects its effectiveness, and that's the tradeoff you make to get the causal structure.
Assuming the 99.9% / 0.1% trick does work and there are large numbers of markets to compensate for the small chance of any given market resolving, what would be the defense against actors placing large bets on a single market with the sole intent of skewing the signal? If the vast majority of bets are consequence-free, it seems that, compared to a regular (non-N/A-resolving) market:
(1) the cost of such an operation would be lower, and
(2) the incentive for rational profit-seeking traders to place enough counter-bet volume to "punish" it would be smaller.
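The asymmetry in (1) and (2) can be put in expected-value terms; here is a toy sketch with made-up numbers (the 0.1% resolution probability is the only figure taken from the scheme itself):

```python
# Hypothetical numbers for illustration. In a market that resolves N/A
# (all bets refunded) with probability 1 - p, both the manipulator's
# expected loss and the counter-trader's expected profit scale down by p.
p_resolve = 0.001            # the 0.1% of markets that actually resolve
mispricing_loss = 10_000     # manipulator's loss IF the market resolves
correction_profit = 10_000   # counter-trader's profit IF it resolves

print(f"expected cost of manipulation:  {p_resolve * mispricing_loss:.2f}")
print(f"expected reward for correcting: {p_resolve * correction_profit:.2f}")
# Both shrink ~1000x relative to an always-resolving market: distorting
# the pre-resolution price signal is cheap, and correcting it is only
# weakly rewarded, unless traders can diversify across many such markets.
```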
With regards to the super-scientist AI (the global human R&D equivalent), wouldn't we see it coming based on the amount of resources it would need to acquire? Are you claiming that it could reach the required AGI capability in its "brain in a box in a basement" state and only afterwards scale up its resource use? The part I'm most skeptical about remains the idea that the resources needed to reach human-level performance are minimal if you just find the right algorithm, because, at least in my view, that neglects the evaluation step in learning, which can be resource-intensive from the start and maybe can't be done "covertly".
---
That... (read more)
Maybe the problem is that we don't have a good metaphor for what "rapidly shooting past human-level capability" looks like in a general sense, rather than in a specific domain.
One domain-specific metaphor you mention is AlphaZero, but games like chess are an unusual domain of learning for the AI, because it doesn't need any external input beyond the rules of the game and objective, and RL can proceed just by the program playing against itself. It's not clear to me how we can generalize the AlphaZero learning curve to problems that are not self-contained games like that, where the limiting factor may not be computing power or memory, but just the availability (and rate of acquisition) of good data to do RL on.
This post inspires for me two lines of thought.
Groups of humans can create $1B/year companies from scratch [...]
If we're thinking of the computing / training effort to get to that point "from scratch", how much can we include? I have Newton's "standing on the shoulders of giants" quote in mind here. Do we include the effort necessary to build the external repositories of knowledge and organizational structures of society that make it possible to build these $1B/year companies within a modern human lifetime and with our individually computationally-limited human brains? Do we expect the "brain-like" (in terms of computational leanness) AGI to piggy-back on human structures (which maybe brings it closer to... (read more)
Re: Point 1, I would consider the hypothesis that some form of egalitarian belief is dominant because of its link with the work ethic. The belief that the market economy rewards hard work implies some level of equality of opportunity, or the idea that most of the time, pre-existing differences can be overcome with work. As an outside observer to US politics, it's very salient how every proposal from the mainstream left or right goes back to that framing, to allow a fair economic competition. So when the left proposes redistribution policies, it will be framed in terms of unequal access to opportunities. That said, it's possible to propose redistribution policies or... (read more)