How do you determine who gets the first 3? Maybe lsusr will be kind enough to provide a symmetry-breaking bit in the "extra" package. (It would only be fair, given that bots playing themselves are automatically given max score.) If not, and you have to do things the hard way, do you compare source code alphabetically, and favour X over Y on even rounds and Y over X on odd rounds?
Also, it may be a good idea to make the level of defection against outsiders depend on the round number. i.e. cooperate at first to maximize points, then after some number of rounds, when you're likely to be a larger proportion of the pool, switch to defecting to drive the remaining bots extinct more quickly.
Another objection is that improvements in biological intelligence will tend to feed into improvements in artificial intelligence. For example, maybe after a couple of generations of biological improvement, the modified humans will be able to design an AI that quickly FOOMs and overtakes the slow generation by generation biological progress.
(It seems likely that once you've picked the low hanging fruit like stuffing people's genomes as full of intelligence-linked genes as possible without giving them genetic diseases, it will be much easier to implement any new intelligence improvements you can think of in code, rather than in proteins. The human brain is a much more sophisticated starting point than any current AI programs, but is probably much harder to modify significantly.)
That's very cool, thanks for making it. At first I was worried that this meant that my model didn't rely on selection effects. Then I tried a few different random seeds, and some, like 1725, didn't show demon-like behaviour. So I think we're still good.
No regularization was used.
I also can't see any periodic oscillations when I zoom in on the graphs. I think the wobbles you are observing in the third phase are just a result of the random noise that is added to the gradient at each step.
Thanks, and your summary is correct. You're also right that this is a pretty contrived model. I don't know exactly how common demons are in real life, and this doesn't really shed much light on that question. I mainly thought that it was interesting to see that demon formation was possible in a simple situation where one can understand everything that is going on.
Thanks. I initially tried putting the code in a comment on this post, but it ended up being deleted as spam. It's now up on github: https://github.com/DaemonicSigil/tessellating-hills It isn't particularly readable, for which I apologize.
The initial vector has all components set to 0, and the charts show the evolution of these components over time. This is just for a particular run, there isn't any averaging. x0 gets its own chart, since it changes much more than the other components. If you want to know how the loss varies with time, you can just flip figure 1 upside down to get a pretty good proxy, since the splotch functions are of secondary importance compared to the -x0 term.
Here is the code for people who want to reproduce these results, or just mess around:
import numpy as np
import matplotlib.pyplot as plt
DIMS = 16 # number of dimensions that xn has
WSUM = 5 # number of waves added together to make a splotch
EPSILON = 0.0025 # rate at which xn controlls splotch strength
TRAIN_TIME = 5000 # number of iterations to train for
LEARN_RATE = 0.2 # learning rate
# knlist and k0list are integers, so the splotch functions are periodic
knlist = torch.randint(-2, 3, (DIMS, WSUM, DIMS)) # wavenumbers : list (controlling dim, wave id, k component)
k0list = torch.randint(-2, 3, (DIMS, WSUM)) # the x0 component of wavenumber : list (controlling dim, wave id)
slist = torch.randn((DIMS, WSUM)) # sin coefficients for a particular wave : list(controlling dim, wave id)
clist = torch.randn((DIMS, WSUM)) # cos coefficients for a particular wave : list (controlling dim, wave id)
# initialize x0, xn
x0 = torch.zeros(1, requires_grad=True)
xn = torch.zeros(DIMS, requires_grad=True)
# numpy arrays for plotting:
x0_hist = np.zeros((TRAIN_TIME,))
xn_hist = np.zeros((TRAIN_TIME, DIMS))
for t in range(TRAIN_TIME):
wavesum = torch.sum(knlist*xn, dim=2) + k0list*x0
splotch_n = torch.sum(
(slist*torch.sin(wavesum)) + (clist*torch.cos(wavesum)),
foreground_loss = EPSILON * torch.sum(xn * splotch_n)
loss = foreground_loss - x0
# constant step size gradient descent, with some noise thrown in
vlen = torch.sqrt(x0.grad*x0.grad + torch.sum(xn.grad*xn.grad))
x0 -= LEARN_RATE*(x0.grad/vlen + torch.randn(1)/np.sqrt(1.+DIMS))
xn -= LEARN_RATE*(xn.grad/vlen + torch.randn(DIMS)/np.sqrt(1.+DIMS))
x0_hist[t] = x0.detach().numpy()
xn_hist[t] = xn.detach().numpy()
plt.xlabel('number of steps')
for d in range(DIMS):
plt.xlabel('number of training steps')