Introduction
In this essay, I name and describe a mechanism that might break alignment in any system in which an originally aligned AI improves its capabilities through self-modification. I doubt this idea is new, but I haven't yet seen it named and described in these terms. This post discusses the mechanism in the context of agents that iteratively improve themselves to surpass human intelligence, but the general idea should apply to most self-modification schemes.
This idea directly draws inspiration from Scott Alexander’s Schelling Fences on Slippery Slopes.
Schelling Shifts
A Schelling shift occurs when an agent fails to properly anticipate that modifying a parameter will increase its next version’s willingness to further modify the...
A new study was published today with results that contradict those found in the UCSF study that you've written about.
Human Hippocampal Neurogenesis Persists Throughout Aging
The Sorrells et al. study is directly mentioned twice in this paper. The first claim is that the study failed to address medication and drug use, both of which affect adult hippocampal neurogenesis.
...