Interested in many things. I have a personal blog at

Thanks for the response! Very helpful and enlightening.

The reason for this is actually pretty simple: genes with linear effects have an easier time spreading throughout a population.

This is interesting -- I have never come across this. Can you expand on the intuition behind this model a little more? Is the intuition something like: in the fitness landscape, genes with linear effects are like gentle slopes that are easy to traverse, as opposed to extremely wiggly 'directions'?

Also, I think the way I am thinking about linearity is maybe slightly different from the usual ANOVA/factor analysis way. I.e. let's suppose that we have some protein which is good, so that more of it is better, and we have 100 different genes which can either upregulate or downregulate it. However, at some large amount, say 80x the usual amount, the benefit saturates. A normal person is very unlikely to have 80/100 positive variants, but if we go in and edit all 100 to be positive, the benefit we get is far below what we would have predicted, since it maxes out at 80. I guess to detect this nonlinearity in a normal population you basically need an 80+th order interaction, with all of the variants interacting in just the right way, which is exceedingly unlikely. Is this your point about sample size?
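To make the saturation worry concrete, here is a minimal toy sketch in Python. Everything in it is an assumption for illustration only: each of the 100 variants is positive with probability 0.5, each positive variant adds one unit of benefit, and the benefit is capped at 80 units. The names `benefit`, `sample_population` and `fit_line` are hypothetical helpers, not anything from the post.

```python
import random

N_GENES = 100      # variants that up- or down-regulate the protein (from the example above)
SATURATION = 80    # assumed cap: benefit stops growing past 80 positive variants


def benefit(n_positive):
    """Benefit is linear in the number of positive variants until it saturates."""
    return float(min(n_positive, SATURATION))


def sample_population(n_people, p_positive=0.5, seed=0):
    """Count positive variants per person, assuming independent coin-flip variants."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < p_positive for _ in range(N_GENES))
        for _ in range(n_people)
    ]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


counts = sample_population(10_000)
intercept, slope = fit_line(counts, [benefit(c) for c in counts])

# Extrapolate the population-level linear fit to a genome with all 100 variants edited.
predicted_all_edited = intercept + slope * N_GENES
actual_all_edited = benefit(N_GENES)

print("largest count seen in the population:", max(counts))
print("linear prediction for 100/100:", predicted_all_edited)
print("actual benefit for 100/100:", actual_all_edited)
```

Under these assumptions nobody in the sampled population gets anywhere near 80 positive variants, so the data look perfectly linear and the fitted line overshoots the true benefit of the fully edited genome.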

I'll talk about this in more detail within the post, but yes we have examples of monogenic diseases and cancers being cured via gene therapy.

This is very cool. Are the cancer cures also monogenic? Has anybody done any large scale polygenic editing in mice or any other animal before humans? This seems the obvious place to explicitly test the causality and linearity directly. Are we bottlenecked on GWAS equivalents for other animals?

This would be very exciting if true! Do we have a good (or any) sense of the mechanisms by which these genetic variants work -- how many are actually causal, how many are primarily active in development vs in adults, how much interference there is between different variants etc? 

I am also not an expert at all here -- do we have any other examples of traits being enhanced or diseases cured by genetic editing in adults (even in other animals) like this? It also seems like this would be easy to test in the lab -- e.g. in mice, which we can presumably sequence and edit more straightforwardly, and in which we can measure some analogues of IQ with reasonable accuracy and reliability. Looking forward to the longer post.

This is an interesting idea. I feel this also has to be related to increasing linearity with scale and generalization ability -- i.e. if you have a memorised solution, then nonlinear representations are fine because you can easily tune the 'boundaries' of the nonlinear representation to precisely delineate the datapoints (in fact, the nonlinearity of the representation can be used to strongly reduce interference when memorising, as is done in the recent research on modern Hopfield networks). On the other hand, if you require a kind of reasonably large-scale smoothness of the solution space, as you would expect from a generalising solution in a flat basin, then this cannot work, and you need to accept interference between nearly orthogonal features as the cost of preserving generalisation of the behaviour across many different inputs which activate the same vector.

Looks like I really need to study some SLT! I will say though that I haven't seen many cases in transformer language models where the eigenvalues of the Hessian are 90% zeros -- that seems extremely high.

I also think this is mostly a semantic issue. The same process can be described in terms of implicit prediction errors, where e.g. there is some baseline level of leptin in the bloodstream that the NPY/AgRP neurons in the arcuate nucleus 'expect'. If there is less leptin, this generates an implicit 'prediction error' in those neurons that causes them to increase firing, which then stimulates various food-consuming reflexes and desires, which ultimately leads to more food and hence 'corrects' the prediction error. It isn't necessary that there are explicit 'prediction error neurons' encoding prediction errors anywhere, although for larger systems it is often helpful to modularize it this way.


Ultimately, though, I think it is more a conceptual question of how to think about control systems -- is it best to think in terms of implicit prediction errors or just in terms of the feedback loop dynamics? Either way, it amounts to the same thing.
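As a toy illustration of why the two descriptions coincide, here is a deliberately crude Python sketch. All constants and the update rule are assumptions made up for the example (in particular, food intake is treated as raising leptin directly, which is a caricature of the real physiology), and the function names are hypothetical.

```python
SETPOINT = 1.0   # baseline leptin level the neurons 'expect' (arbitrary units, assumed)
GAIN = 0.5       # how strongly firing responds to a leptin deficit (assumed)
DECAY = 0.1      # fraction of leptin lost per time step (assumed)


def firing_feedback(leptin):
    """Feedback-loop description: firing rises as leptin falls below baseline."""
    return max(0.0, GAIN * (SETPOINT - leptin))


def firing_prediction_error(leptin):
    """Prediction-error description: expected minus observed leptin drives firing."""
    prediction_error = SETPOINT - leptin
    return max(0.0, GAIN * prediction_error)


def simulate(steps=200, leptin=0.2):
    """Run the loop: firing drives eating, eating raises leptin, leptin decays."""
    for _ in range(steps):
        eating = firing_feedback(leptin)
        leptin = leptin - DECAY * leptin + eating
    return leptin


print("settled leptin level:", simulate())
```

The two firing functions compute the same number; only the naming of the intermediate quantity differs. In this toy loop the leptin level settles where decay balances the error-driven intake, at `GAIN * SETPOINT / (GAIN + DECAY)`, slightly below the setpoint.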

This is where I disagree! I don't think the Morrison and Berridge experiment demonstrates the model-based side. It is consistent with model-based RL but is also consistent with model-free algorithms that can flexibly adapt to changing reward functions, such as linear RL. Personally, I think the latter is more likely, since it is such a low level response which can be modulated entirely by subcortical systems and so seems unlikely to require model-based planning to work.

Thanks for linking to your papers; it's definitely interesting that you have been thinking along similar lines. The key reason I think studying this is important is that these hedonic loops demonstrate that a.) mammals, including humans, are actually exceptionally aligned to basic homeostatic needs and basic hedonic loops in practice. It is extremely hard and rare for people to choose not to follow homeostatic drives. I think humans are mostly 'misaligned' about higher level things like morality, empathy etc. because we don't actually have direct drives hardcoded in the hypothalamus for them the way we do for primary rewards. Higher level behaviours are either socio-culturally learned through unsupervised cortically based learning or derived from RL extrapolations from primary rewards. It is no surprise that alignment to these ideals is weaker. b.) Relatively simple control loops are very effective at controlling vastly more complex unsupervised cognitive systems.

I also agree this is similar to Steven Byrnes' agenda, and maybe this is just my way of arriving at it.

This is definitely possible, and it essentially amounts to augmenting the state variables with additional homeostatic variables and then learning policies on the joint state space. However, there are some clever experiments, such as the linked Morrison and Berridge one, demonstrating that this is not all that is going on -- specifically, many animals appear to be able to perform zero-shot changes in policy when rewards change, even if they have not experienced this specific homeostatic state before -- i.e. mice suddenly chase after salt water, which they previously disliked, when put in a state of salt deprivation which they had never before experienced.
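One way to picture how a zero-shot switch like this could happen without relearning is to keep learned outcome expectations separate from the current value of each outcome, in the spirit of successor features. The Python sketch below is only a toy under that assumption; it is not a model of the experiment and not the linear RL algorithm, and every name and number in it is hypothetical.

```python
# Hypothetical learned expectations: how much (salt, sugar) each action delivers.
OUTCOME_EXPECTATIONS = {
    "approach_salt_water": (1.0, 0.0),
    "approach_sugar_water": (0.0, 1.0),
}


def reward_weights(salt_deprived):
    """Assumed mapping from internal state to the current value of each outcome."""
    salt_value = 1.0 if salt_deprived else -1.0   # salt is aversive unless deprived
    sugar_value = 0.5
    return (salt_value, sugar_value)


def action_values(weights):
    """Value of each action = learned outcome expectations dotted with current weights."""
    return {
        action: sum(f * w for f, w in zip(features, weights))
        for action, features in OUTCOME_EXPECTATIONS.items()
    }


def best_action(salt_deprived):
    values = action_values(reward_weights(salt_deprived))
    return max(values, key=values.get)


print("normal state:", best_action(salt_deprived=False))
print("salt deprived:", best_action(salt_deprived=True))
```

The outcome expectations never change between the two calls. Only the weights set by the internal state change, and the preferred action flips immediately, with no new experience of the deprived state required.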

The 'four years' they explicitly mention does seem very short to me for ASI unless they know something we don't...

AI x-risk is not far off at all, it's something like 4 years away IMO

Can I ask where this four years number is coming from? It was also stated prominently in the new 'superalignment' announcement. Is this some agreed-upon median timeline at OAI? Is there an explicit plan to build AGI in four years? Is there strong evidence behind this view -- i.e. that you think you know how to build AGI explicitly and it will just take four more years of compute/scaling?
