# Algon's Shortform

This is a special post for quick takes by Algon. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

In "Proofs and Refutations", Imre Lakatos[1] portrays a Socratic discussion between a teacher and students as they try to prove Euler's theorem V + F = E + 2. The beauty of this essay is that the discussion mirrors the historical development of the subject, whilst also critiquing the formalist school of thought, the modern agenda of meta-mathematics, and how it doesn't fit the way mathematics is done in practice. Whilst I'm on board with us not knowing how to formalize mathematical practice, I think it is a solvable problem. Moreover, I am a staunch ultra-finitist believer in reality being computable, which dovetails with a belief in proofs preserving truth. Yet that matters little in the face of such a fantastic exposition on mathematical discovery.[2] Moreover, it functions as a wonderful example of how to alternate between proving and disproving a conjecture, whilst incorporating the insights we gain into our conjecture.

In fact, it was so stimulating that after reading it I came up with two other proofs on the spot, though the core idea is the same as in Lakatos' proof. After that experience, I feel like showing a reader how a proof is generated is a dang good substitute for interacting with a mathematician in real life. Mathematics is one of the few areas where text and images, if read carefully, can transfer most tacit information. We need more essays like this.[3]

Now, if only we could get professors to force students to try and prove theorems within the lecture, dialogue with them and transcribe the process. Just think, when the professor comes to the "scrape together lecture material and turn it into a textbook" part of their lifecycle, we'd automatically get beautiful expositions. Please excuse me whilst I go cry in a corner over unattainable dreams.[4]

1. ^

Not a Martian as he wasn't born in Budapest.

2. ^

And how weird things were in the days before Hilbert mastered mathematics and brought rigour to the material world. Listen to this wildly misleading quote: "In the 19th century, geometers, besides finding new proofs of the Euler theorem, were engaged in establishing the exceptions which it suffers under certain conditions." From p. 36, footnote 1.

3. ^

Generalized Heat Engine and Lecture 9 of Scott Aaronson's Democritus lectures on QM are two other expositions which are excellent, though not as organic as Proofs and Refutations.

4. ^

Funnily enough, Euler's papers contain clear descriptions of how he came to the proofs. At least, those I've read. Which is, like, 1/10,000th of his material by word count. I'm not kidding.[5][6]

5. ^

http://archive.boston.com/bostonglobe/ideas/brainiac/2012/11/the_100-year_pu.html

6. ^

http://eulerarchive.maa.org/

When tracking an argument in a comment section, I like to skip to the end to see if either of the arguers winds up agreeing with the other. Which tells you something about how productive the argument is. But when using the "hide names" feature on LW, I can't do that, as there's nothing distinguishing a cluster of comments as all coming from the same author.

I'd like a solution to this problem. One idea that comes to mind is to hash all the usernames in a particular post and a particular session, so you can check if the author is debating someone in the comments without knowing the author's LW username. This is almost as good as full anonymity, as my status measures take a while to develop, and I'll still get the benefits of being able to track how beliefs develop in the comments.
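A minimal sketch of the hashing idea, assuming a random secret is generated once per reading session and mixed with the post ID. The names `make_session_key` and `pseudonym`, and the `anon-` label format, are hypothetical and not part of the LW codebase. The same commenter gets the same label throughout one post, but the label changes across posts and sessions, so it can't be tied back to a username.

```python
import hashlib
import hmac
import secrets


def make_session_key() -> bytes:
    """Generate a fresh random secret once per reading session (hypothetical helper)."""
    return secrets.token_bytes(32)


def pseudonym(username: str, post_id: str, session_key: bytes) -> str:
    """Map a username to a short label that is stable within one post and session.

    The label is a keyed hash (HMAC-SHA256) of the post ID and username, so a
    reader who doesn't inspect the session key can't recover the username.
    """
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from producing the same message.
    message = f"{post_id}\x00{username}".encode("utf-8")
    digest = hmac.new(session_key, message, hashlib.sha256).hexdigest()
    return "anon-" + digest[:8]


if __name__ == "__main__":
    key = make_session_key()
    for author in ["alice", "bob", "alice"]:
        print(pseudonym(author, "example-post", key))
```

Truncating the digest to eight hex characters keeps labels readable; collisions between two commenters in the same thread are possible in principle but very unlikely at comment-section scale.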

@habryka

Yeah, I think the hide author feature should replace everyone with single letters or something, or give you the option to do that. If someone wants to make a PR with that, that would be welcome, we might also get around to it otherwise at some point (but it might take a while)

I've been thinking about exercises for alignment, and I think going through a list of lethalities and applying them to an alignment proposal would be a good one. Doing the same with Paul's list would be a bonus challenge. If I had some pre-written answer sheet for one proposal, I could try the exercise myself to see how useful it would be. This post, which I haven't read yet, looks like it would serve for the case of RLHF. I'll try it tomorrow and report back here.

I am very glad the Lightcone team made the discussions feature. Comment threads on LW are about as valuable as the posts themselves, and the discussions feature just puts comment threads on equal footing with posts. Obvious in retrospect. Why wasn't it done earlier though?

Hypothesis: agency violating phenomena should be thought of as edge-cases which show that our abstractions of ourselves as agents are leaky.

For instance, look at addictive substances like heroin. These substances break down our Cartesian boundary (our intuitive separation of the world into ourselves and the environment with a boundary) by chemically assaulting the reward mechanisms in our brain.

However, video games or ads don't obviously violate our Cartesian boundary, which may be one of many boundaries we assume exist. Which, if my hypothesis is true, suggests that you could try to find other boundaries/abstractions violated by those phenomena. Other things which "hack" humans, like politics or psyops, would violate boundaries as well.

Finding the relevant abstractions and seeing how they break would increase our understanding of ourselves as agents. This could help triangulate a more general definition of agency for which these other boundaries are special cases or approximations.

This seems like a hard problem. Just building a taxonomy of our known abstractions for agency is less useful, but much more feasible for a few months' work. Sounds like a good research project.

My mind keeps returning to exercises which could clarify parts of alignment, both for me and others. Some of them are obvious: think about what kind of proof you'd need to solve alignment, what type of objects it would have to be talking about etc. and see whether that implies having a maths oracle would make the problem easier. Or try and come up with a list of human values to make a utility function and see how it breaks down under greater optimization pressure.

But what about new exercises? For skillsets I've never learnt? Well, there's the security mindset, which I don't have. I think it is about "trying to break everything you see", so presumably I should just spend a bunch of time breaking things or reading the thoughts of people who deeply inhabit this perspective for more tacit knowledge. For the former, I could do something like exercises for computer security: https://hex-rays.com/ida-free/ For the latter, I've heard "Silence on the Wire" is good: the author is supposedly a hacker's hacker, and writes about solutions to security challenges which defy classification. Seeing solutions to complex, real-world problems is very important to developing expertise.

But I just had a better thought: wouldn't watching someone hacking something be a better example of the security mindset? See the problem they're tackling and guess where the flaws will be. That's the way to really acquire tacit knowledge. In fact, looking at the LockPickingLawyer's channel is what kicked off this post. There, you can see every lock under the sun picked apart in minutes. Clearly, the expertise is believable. So maybe a good exercise for showing people that security mindset exists, and perhaps to develop it, would be getting a bunch of these locks and their design specs, giving people some tools, and asking them to break them. Then show them how the LockPickingLawyer does it. Again, and again and again.

One thing I'm confused about re: human brain efficiency is, if our brain's advantage over apes is just scaling and some minor software tweaks to support culture, what does that imply for corvid brains? If you scaled corvid brains up by the human-cortical-neuron-count/chimp-cortical-neuron-count ratio, and gave them a couple of software tweaks, wouldn't you get a biological Pareto improvement over human brains?

Obvious thing I never thought of before:

Linear optimization where your model is of the form $W_n W_{n-1} \cdots W_1$, the $W_i$ being matrices, will likely result in an effective model of low rank if you randomize the weights. Compared to just a single matrix -- to which the problem is naively mathematically identical, but not computationally -- this model won't be able to learn the identity function, rotations and so on when $n$ is large.

Note: Someone else said this on a gathertown meetup. The context was, that it is a bad idea to think about some ideal way of solving a problem, and then assume a neural net (or indeed any learning algorithm) would learn it. Instead, focus on the concrete details of the model you're training.
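A rough way to check the low-rank claim at initialization, as a sketch rather than anything from the original discussion. It multiplies together random Gaussian matrices and measures an effective rank of the product. The effective-rank measure used here, the participation ratio $(\sum_i \sigma_i^2)^2 / \sum_i \sigma_i^4$ of the singular values, is my own choice, as are the dimension, depths and the $1/\sqrt{d}$ weight scale. It says nothing about what training does afterwards.

```python
import math
import random


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def random_matrix(dim, rng):
    """Gaussian matrix with entries of variance 1/dim (an assumed initialization)."""
    scale = 1.0 / math.sqrt(dim)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]


def participation_ratio(a):
    """Effective rank (sum s_i^2)^2 / (sum s_i^4), computed from A^T A.

    It equals dim for an orthogonal matrix and 1 for a rank-one matrix.
    """
    gram = matmul([list(col) for col in zip(*a)], a)  # A^T A
    trace = sum(gram[i][i] for i in range(len(gram)))  # sum of s_i^2
    frob_sq = sum(x * x for row in gram for x in row)  # sum of s_i^4
    return trace * trace / frob_sq


def effective_rank_of_product(depth, dim=20, seed=0):
    """Effective rank of W_depth ... W_1 for freshly sampled random W_i."""
    rng = random.Random(seed)
    product = random_matrix(dim, rng)
    for _ in range(depth - 1):
        product = matmul(random_matrix(dim, rng), product)
    return participation_ratio(product)


if __name__ == "__main__":
    for depth in (1, 4, 16, 64):
        print(depth, round(effective_rank_of_product(depth), 2))
```

If the claim holds, the printed effective rank should shrink as depth grows, because the singular values of a long random product spread apart and a few directions dominate.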

wow I'm not convinced that won't work. the only thing initializing with random weights should do is add a little noise. the naive mathematical interpretation should be the only possible interpretation up to your float errors, which, to be clear, will be real and cause the model to be invisibly slightly nonlinear. but as long as you're using float32, you shouldn't even notice.

[trying it, eta 120 min to be happy with results... oops I'm distractable...]

EDIT: Sorry, I tried something different. I fed in dense layers, followed by batchnorm, then ReLU. I ended it with a sigmoid, because I guess I just wanted to constrain things to the unit interval. I tried up to six layers. The difference in loss was not that large, but it was there. Also, hidden layers were 30 dim.

I tried this, and the results were a monotonic decrease in performance after a single hidden layer. My dataset was 100k samples of a 20 dim tensor sampled randomly from [0,1] for X, with Y being a copy of X. Loss was MSE, optimizer was Adam with weight decay 0.05, lr ~0.001, minibatch size was 32, trained for 100,000 steps.

Also, I am doubtful of the mechanism (rank loss) being a big thing for such small depths. But I do think there's something to the idea. If you multiply a long sequence of matrices, then I expect them to get extremely large, extremely small, or tend towards some kind of equilibrium. And then you have numerical stability issues and so on, which I think will ultimately make your big old matrix just sort of worthless.

oh by linear layer you meant nonlinear layer, oh my god, I hate terminology. I thought you literally just meant matrix multiplies

My body is failing me. I have been getting colds near weekly for a year and a half, after a particularly wretched cold. My soul is failing me. I have been worn down by a stressful environment, living with an increasingly deranged loved one. By my crippled mind's inability to meet the challenge. Which causes my body to further fail. Today, I grokked that I am in a doom spiral, headed down the same path as my kin's. I don't wish for so wretched an end, for an end it shall be.

But why my failing soul? Why does the algorithm which calls itself Algon fail when challenged so? Because the piece which calls itself Algon is blind to what the rest of his soul says, and so it takes action. He reshapes himself to be a character which will bring things to a head, as he knew it would eventually come to. Burst out in anger, and maybe the collapse won't break all my kin.

What shall I do now? The goal is restoring my deranged kin to sanity. The path must involve medication of a sort, and more care than I am currently shaped to give. The obstacles are wealth, and a kin-folk's fear of medication. With that one, wrath has a poor tool compared to raw truth. And perhaps, with wealth, they may be able to give the care needed to our deranged kin.

Courage is needed, or the removal of fear. And I shall do so the only way I know how: by holding it with me, looking ever closer, till it has no power over me.

After thinking about how to learn to notice the feel of improving in pursuit of the meta, I settled on trying to reach the meta-game in a few video games.

After looking at some potential games to try, I didn't follow up on them and kept playing Die in the Dungeon. Nothing much happened, until my improvement decelerated. For whatever reason, I chose to look in the comments section of the game for advice. I rapidly found someone claiming they beat the game by removing every die except 2 attack, 1 defence, 1 heal and 1 boost die and upgrading them to the max. Supposedly, predictability was the main benefit, as you draw five random dice from your deck each turn.[1]
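To make the predictability point concrete, here is a small sketch that computes the exact variance in how many attack dice show up in a five-die hand. The 15-die deck with 6 attack dice is a made-up comparison point, not a real build from the game, and the function names are mine. Drawing without replacement gives a hypergeometric distribution.

```python
from math import comb


def attack_count_distribution(deck_size, attack_dice, hand_size=5):
    """Exact probabilities for the number of attack dice in one drawn hand."""
    total_hands = comb(deck_size, hand_size)
    distribution = {}
    for k in range(min(attack_dice, hand_size) + 1):
        # Choose k attack dice and fill the rest of the hand with other dice.
        ways = comb(attack_dice, k) * comb(deck_size - attack_dice, hand_size - k)
        if ways:
            distribution[k] = ways / total_hands
    return distribution


def variance(distribution):
    """Variance of a discrete distribution given as {value: probability}."""
    mean = sum(k * p for k, p in distribution.items())
    return sum((k - mean) ** 2 * p for k, p in distribution.items())


if __name__ == "__main__":
    trimmed = attack_count_distribution(deck_size=5, attack_dice=2)
    bloated = attack_count_distribution(deck_size=15, attack_dice=6)  # hypothetical deck
    print("trimmed deck variance:", variance(trimmed))
    print("bloated deck variance:", variance(bloated))
```

With only five dice in the deck, every hand is the whole deck, so the number of attack dice never varies from turn to turn.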

Fighting against my instincts, I followed the advice. And won. Just, completely destroying every boss in my way. Now, maybe this is what the feel of improving in pursuit of the meta looks like. "Search for advice that seems credible but feels counter-intuitive[2], try it and see if it makes sense, improve and repeat"?

EDIT: Feeling lacking cause I didn't try to immediately break this hypothesis. First, isn't this just "listen to good advice?" If so, I do sometimes feel like I am ignoring advice from credible people. But the mental examples I'm thinking of right now, like beating Angband, don't have much to do with meta-games. Should I be looking at the pre-requisites for meta-game skills and just aiming for those? But aren't many of them too hard to try out and make sense of without building up other skills first? In which case, perhaps the core feeling is more like finally understanding inscrutable advice. In which case, I guess I need to look for some game where the advice doesn't seem effective when I try it out?

Yet again, that isn't enough. Many skills make you worse when you first try them out, as you need to figure out how to apply them at all. Give a martial artist a sword for the first time and they'll lose faster. And many people hear advice from experts and think they understand it without really getting it. So advice for people who are just below experts doesn't have to appear inscrutable, though it may well be inscrutable. Am confused about what to do now.

1. ^

Yes, I should have tried reducing variance earlier. I am a dum-dum.

2. ^

Healing/defence dice seemed more valuable to me, alongside a couple of mirror dice.

Sci-Hub has a Telegram bot which you can use with a desktop application. It is fast, and more importantly reliable. No more scrounging through proxy lists to find a link that works. Unfortunately, you need to install Telegram on a phone first. Still, it has saved me some time and is probably necessary for an independent researcher.

Applying to the job in this tweet by NatFriedman and I think writing this shortform is evidence that I am the kind of person who does a) and understands b)
