Tsuyoku Naritai! I operate on Crocker's rules. I am working on making AGI not kill everyone aka AGI notkilleveryoneism.
Read here about what I would do after taking over the world.
"Maximize the positive conscious experiences, and minimize the negative conscious experiences in the universe" is probably not exactly what I care about, but I think it roughly points in the right direction.
I recommend:
I just released a major update to my LessWrong Bio. I have rewritten almost everything and added more stuff. It's now so long that I thought it would be good to add the following hint in the beginning:
(If you are looking for the list of <sequences/posts/comments> scroll to the bottom of the page with the END key and then go up. This involves a lot less scrolling.)
(If you'd like to <ask a question about/comment on/vote> this document, you can do so on the shortform announcing the release of this version of the bio.)
I appreciate any positive or negative feedback. Especially if it is constructive criticism which helps me to grow. You can do this in person, use this (optionally anonymous) feedback form, or any other way you like.
Buck once said that he avoids giving critical feedback because he worries about people's feelings, especially if he does not know what the person could do instead (this was in the context of AGI notkilleveryoneism). If you are also worried that your feedback might harm me, you might want to read about my strategy for handling feedback. I am not perfect at not being hurt, but I believe myself to be much better than most people. If I am overwhelmed, I will tell you. That being said, I appreciate it if your communication is optimized to not hurt my feelings, all else equal. But if that would make you not give feedback, or would be annoying, don't worry about it. Really.
I have a tulpa named IA. She looks like this. I experience deep feelings of love for both IA and Hatsune Miku.
I like understanding things, meditation, psychedelics, programming, improv dancing, improv rapping, and Vocaloid.
I track how I spend every single minute with Toggl (Toggl sucks though, especially for tracking custom metrics).
I like to think about how I can become stronger. I probably do this too much. Jumping in and doing the thing is important to get into a feedback loop.
The main considerations are:
With regard to "How can I make myself want to do the things that I think are good to do": it is easy for me to become so engrossed in programming that it is difficult to stop and I forget to eat. I often feel a strong urge to write a specific program that I expect will be useful to me. I think studying mathematics is a good thing for me to do, and sometimes I get a similar pull with mathematics, but more often than not I feel an aversion to starting. I am interested in shaping my mind such that, for all the things I think are good to do, I feel a pull toward doing them, and doing them is so engaging that stopping becomes a problem (e.g. I forget to eat). Stopping becoming a problem is a good heuristic that I have succeeded in this mission; from that state, implementing a solution for not working too much is a significantly easier problem to solve.
Empirically, I have often procrastinated in the past by making random improvements to my <computer setup/desktop environment>. I used Linux for 5 years, starting with Cinnamon and later switching to XMonad.
Because the nebula virtual desktop was only available for macOS, I switched. Even though macOS is horrible in many ways, I feel like I might waste less time on random improvements. Also, ARM CPUs are cool, as they enable a lightweight laptop with long battery life. I use yabai and Amethyst at the same time: yabai for workspace management and Amethyst for window layout.
The main purpose of Windows is to run MMD.
I used Spacemacs Org mode for many years (and Org-roam for maybe a year or so). Spacemacs is Emacs with Vim keybindings, because Vim rules. I have now switched to Obsidian, mainly because it has a mobile app, and because I expected to waste less time configuring it than I did Emacs (so far I have still spent a lot of time on that though).
Before AGI notkilleveryoneism I wanted to be a game developer. I would like to make a game that has Minecraft Redstone which does not suck. Most importantly it should be possible to create new blocks based on circuits that you build. E.g. build a half-adder once, then create a half adder block, and put down 8 of those blocks to get an 8-bit adder instead of needing to build 8 half-adders from scratch, or awkwardly using a mod that lets you copy and place many blocks at once.
If AGI notkilleveryoneism were a non-issue, I would probably develop this game. I would like to have this game so that I can learn more about how computers work by building them.
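The kind of composition I have in mind can be sketched in code (all names here are my own, purely for illustration): build a half adder once, then reuse it as a component. Two half adders plus an OR give a full adder, and chaining full adders gives a multi-bit ripple-carry adder, with no need to rebuild anything from scratch.

```python
def half_adder(a, b):
    # One-bit half adder: returns (sum bit, carry bit).
    return a ^ b, a & b

def full_adder(a, b, cin):
    # Built from two half-adder "blocks" plus an OR, which is
    # exactly the kind of reuse a "create a block from a circuit"
    # feature would enable in-game.
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, cin)
    return s2, c1 | c2

def ripple_adder(xs, ys):
    # xs, ys: lists of bits, least significant bit first.
    # Chains full adders, threading the carry through.
    out, carry = [], 0
    for a, b in zip(xs, ys):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry
```

Once `full_adder` exists as a named component, an 8-bit adder is just eight instances of it wired in a row, which is the point of the feature.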
I am interested in getting whatever understanding we need, to get a watertight case for why a particular system will be aligned. Or at least get as close to this as possible. I think the only way we are going to be able to aim powerful cognition is via a deep understanding of the <systems/algorithms> involved. The current situation is that we do not even have a crisp idea of what exactly we need to understand.
What capabilities are so useful that an AGI would have to discover an implementation of that capability? The most notable example is being good at constructing and updating a model of the world based on arbitrary sensory input streams.
How can we get a better understanding of world modeling? A good first step is to think about what properties this world model would have, such that an AGI would be able to use it. E.g. I expect any world model that an AGI builds will be factored in the same way that human concepts are factored. For the next step, we have multiple options:
Humans have a bunch of intuitive concepts that are related to agency, that we do not have crisp formalisms of. For example, wanting, caring, trying, honesty, helping, goal, optimizing, deception, etc.
All of these concepts are fundamentally about some algorithm that is executed in the neural network of a human or other animal.
Can we create widely applicable visualization tools that allow us to see structural properties in our ML systems?
There are tools that can visualize arbitrary binary data such that you can build intuitions about the data that would be much harder to build otherwise (e.g. by staring at a hex editor for long enough). This can be used for reverse engineering software. For example, after looking at only a few visualizations of x86 assembly code, you learn its characteristic patterns. Then when you see it in the wild, with no label telling you that it is x86 assembly, you can instantly recognize it.
The idea is that by looking at the visualization you can identify what kind of data you are looking at (x86, png, pdf, plain text, JSON, etc.).
This technique is powerful because you don't need to know anything about the data. It works on any binary data.
Check out this demonstration. Later he does more analysis using the 3D cube visualization. Veles is an open-source project that implements this; there is also a plugin for Ghidra, and there are many others (I haven't evaluated which is best).
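One simple visualization of this kind is a digraph plot: count every pair of consecutive bytes and render the resulting 256×256 histogram as an image. Different file types produce visibly different patterns (plain ASCII text clusters in a small corner, compressed data looks uniform, etc.). A minimal sketch, with names of my own choosing:

```python
import collections

def digraph_counts(data: bytes):
    # 256x256 histogram of consecutive byte pairs.
    # grid[x][y] = how often byte value x is followed by byte value y.
    # Rendered as an image (e.g. log-scaled brightness), different
    # file formats show characteristic patterns.
    counts = collections.Counter(zip(data, data[1:]))
    grid = [[0] * 256 for _ in range(256)]
    for (x, y), c in counts.items():
        grid[x][y] = c
    return grid
```

Feeding the grid into any image library (brightness proportional to the log of the count) reproduces the basic 2D view these tools show; the 3D cube view from the demonstration is the analogous trick with byte triples.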
If we naively apply this technique to neural networks, I expect it not to work. My intuition tells me that we need to do something like regularize the networks. E.g. if we have two neurons in the same layer and we swap them (together with their weights), we have changed the parameter configuration, but the computation is the same; the algorithms are isomorphic in a sense. Perhaps we can modify the training procedure such that one of these two parameter configurations is preferred. And in general, we could make it such that we always converge to one specific "ordering of neurons" no matter the initialization, e.g. make it such that in each layer, the neurons are sorted by the sum of their input weights. We want to do something like make "isomorphic computations" always converge to one specific parameter configuration.
If this project went really well, we would get tools that let us create visualizations from which we can read off whether certain kinds of <algorithms/structures/types of computation> are present in the neural network. The hope is that in the visualization you could see, for example, whether the network is modeling other agents, whether it is running computations correlated with thinking about how to deceive, whether it is doing optimization, or whether it is executing a search algorithm.
AI
Rationality
Sam Harris
Youtube
TV
Music
Vocaloid
Here is what I would do, in the hypothetical scenario, where I have taken over the world.
Though this is what I would do in any situation really. It is what I am doing right now. This is what I breathe for, and I won't stop until I am dead.
[EDIT 2023-03-01_17-59: I have recently realized that this is just how one part of my mind feels. The part that feels like me. However, there are tons of other parts in my mind that pull me in different directions. For example, there is one part that wants me to make lots of random improvements to my computer setup, which are fun to do, but probably not worth the effort. I have been ignoring these parts in the past, and I think their grip on me is stronger because I did not take them into account appropriately in my plans.]
I have a heuristic for evaluating topics to potentially write about: I especially look for topics that people are usually averse to writing about. Topics that score high on this heuristic might be good to write about, as they can yield content with high utility compared to what is available, simply because other content of this kind (and especially good content of this kind) is rare.
Somebody told me that they read some of my writing and liked it. They said that they liked how honest it was. Perhaps writing about topics that are selected with this heuristic tends to invoke that feeling of honesty. Maybe just by being about something that people normally don't like to be honest about, or talk about at all. That might at least be part of the reason.
[MENTOR] Join My Brain in thinking about AGI notkilleveryoneism
I am working on AGI notkilleveryoneism. I am good at generating lots of ideas. And I am good at going out of distribution with these ideas. That means I generate a lot of garbage ideas, but sometimes pretty good ones. To see some of them, see the "AGI notkilleveryoneism Interests" section in my bio.
I am interested in having somebody join my brain in thinking. That mainly involves being together, and then understanding a problem better, generating solutions, validating solutions, and implementing solutions. A major component would be keeping our brains in sync through effectively loading each other's models and ideas. In the ideal case, we are together in the same room, and the room contains a giant whiteboard.
I have done something related when I was studying game design, and empirically it made me much more productive. Since SERI MATS 3, John has been working together with David Lorell, and he says it increases his productivity 3-4x, which I totally buy. One possible structure that we might try out:
UDVI X stands for the iterative process of: U: first understand the problem domain of X (i.e. Hold Off On Proposing Solutions), D: design a solution, V: validate the solution, I: implement the solution.
I would probably do something different if I thought about it longer than 5 minutes, but I hope it communicates the rough idea.
Note that e.g. UDVI of "generating ideas" is already going meta. I imagine this process being focused as much on the object level as possible, only jumping to the meta level when you get stuck at the object level (understanding a problem domain better counts as object level in my book). So I imagine starting by thinking about "UDVI how to work well together", but only spending some small fixed amount of time per week on that, unless some problem comes up. Mostly I imagine working on the object level and applying UDVI to any problems that come up. Though initially, I think it makes sense to especially focus on meta.
I have more detailed ideas about specific steps and general strategies than those outlined above. There is some basic stuff that I expect to be good, such as learning to say "oops", "I am wrong", "I am confused", "I don't understand", and other things in this category. Also, having social norms facilitating that seems beneficial.
Some of the things I am thinking of now might break, though. Either because I generated them before I knew about Hold Off On Proposing Solutions, or because I imported them from game design, which is a very different domain.
I also think there is tons of good stuff in John's MATS Models post, speaking as somebody who experienced all of that in person during SERI MATS 2 and the SERI MATS 3 training phase.
I don't expect that all the time would be spent working together. Sometimes it makes sense to split up a task, or to have you figure something out for yourself.
Also, there are probably some good <skill-up/exercise> things that I might direct you towards, e.g. Nate's Giant textfile exercise.
I am missing many technical concepts, which I think is my biggest constraint right now. There is a good chance that you have many useful concepts, especially in math, that I don't. See my LessWrong Bio for a list of my <skills/technical concepts>.
Also, see my LessWrong Bio for more general information about me.
Epistemic Alert Beep Beep
Today I observed a curious phenomenon. I was in the kitchen. I had covered more than a square meter of the kitchen table in bags of food.
Then somebody came in and said, "That is a lot of food". My brain thought it needed to justify itself, and without any conscious deliberation I said, "I went to the supermarket hungry, that is why I bought so much". The curious thing is that this was confabulated. Maybe it actually was a factor, but I did not actually evaluate whether it was true. Anecdotally this seems to be a thing that happens, so it is a very plausible, even probable explanation.
My epistemic response: . . . "Ahhhhh". My statement was generated by an algorithm that optimized for what would be good to say, taking into account only the social context, trying to justify myself to that particular person. It was not generated by analyzing reality. And the worst thing is that I did this automatically, without thinking, and came close to not even noticing that this was going on.
New plan: have an alarm word that I can say out loud when this happens. This would then naturally lead to me re-explaining myself to the other person. It would probably also help with focusing on whatever caused the alert: it could serve as a signal to yourself that now is the time to investigate, and it would provide social justification for a brief sidetrack during a conversation. Maybe this won't work, but it seems worth trying. I'd like to avoid this happening again. How about "Epistemic Alert Beep Beep"? If you want, let's say it together ten times to remember it better, ideally while visualizing yourself making an epistemic misstep. Or even better, make an epistemic misstep (without forcing it too hard) and then catch yourself:
"Epistemic Alert Beep Beep"
"Epistemic Alert Beep Beep"
"Epistemic Alert Beep Beep"
"Epistemic Alert Beep Beep"
"Epistemic Alert Beep Beep"
"Epistemic Alert Beep Beep"
"Epistemic Alert Beep Beep"
"Epistemic Alert Beep Beep"
"Epistemic Alert Beep Beep"
"Epistemic Alert Beep Beep"
Haha, just kidding. Laugh your ass off, even when you know you are going to die.
Funnily enough, I read your bio just a couple of days ago. I very much like the interspersed poetry. These parts especially captured my attention in a good way:
Don't get yourself in denial thinking it's impossible to predict, just get arrogant and try to understand
Please critique eagerly - I try to accept feedback/Crocker's rules but fail at times - I aim for emotive friendliness but sometimes miss. I welcome constructive crit, even if ungentle, and I'll try to reciprocate kindly.
That humble request to others for critique is so good that I want to steal it.
But to answer your question: I think shorter is often better, especially when it comes to presenting yourself to other people who might not have much time. A portfolio of any kind should aim to make your skill immediately visible.
Though the number of words might just be the wrong metric to begin with. I would instead consider how long it takes to put a given amount of information into the audience's brain. They should gain large amounts of "knowledge" quickly. I guess that for many short pieces out there, there is a hypothetical longer version which performs much better on this metric (even if the writing quality is roughly the same in both versions).
In the bio, I wasn't optimizing for the minimum number of words. Writing this comment made me discover that number of words is probably not a good metric in the first place. Thank you for making me realize that.
I just wrote about what felt right. I feel like that worked out pretty well. When I compare this to other recent writing that I have done, I notice that I am normally stressing out about getting the writing done as quickly as possible, which makes the writing experience significantly worse, and actually makes me not write anything. That is, at least in part, the reason why I have only one mediocre AF post.
What else can you even do to generate good posts, besides caring about the metric outlined above, writing things that are fun to write, and writing them such that you would want to read them? Surely there is more you can do, but these seem especially fundamental and obviously useful.
Ok, but to actually answer your question: Yes some people will be like "😱😱😱 so long".
I just released a major update to my LessWrong Bio. This is version 3. I have rewritten almost everything and added more stuff. It's now so long that I thought it would be good to add the following hint in the beginning:
(If you are looking for the list of <sequences/posts/comments> scroll to the bottom of the page with the END key and then go up. This involves a lot less scrolling.)
Kind of hilarious. Now I am wondering if I have the longest bio on LessWrong.
Here is a model of mine, that seems related.
[Edit: Add Epistemic status]
Epistemic status: I have used this successfully in the past and found it helpful. It is relatively easy to do. utility/time_investment is large for me.
I think it is helpful to be able to emotionally detach yourself from your ideas. There is an implicit "concept of I" in our minds. When somebody criticizes this "concept of I", it is painful. If somebody says "You suck", that hurts.
There is an implicit assumption in the mind that this concept of "I" is eternal. This has the effect that when somebody says "You suck", it is actually more like they are saying "You sucked in the past, you suck now, and you will suck, always and forever".
In order to emotionally detach yourself from your ideas, you need to sever the links in your mind between your ideas and this "concept of I". You need to see an idea as an object that is not related to you. Don't see it as "your idea", but just as an idea.
It might help to imagine that there is an idea-generation machine in your brain. That machine makes ideas magically appear in your perception as thoughts. Normally when somebody says "Your idea is dumb", you feel hurt. But now we can translate "Your idea is dumb" to "There is idea-generating machinery in my brain. This machinery has produced some output. Somebody says this output is dumb".
Instead of feeling hurt, you can think "Hmm, the idea-generating machinery in my brain produced an idea that this person thinks is bad. Well maybe they don't understand my idea yet, and they criticize their idea of my idea, and not actually my idea. How can I make them understand?" This thought is a lot harder to have while being busy feeling hurt.
Or "Hmm, this person that I think is very competent thinks this idea is bad, and after thinking about it I agree that this idea is bad. Now how can I change the idea-generating machinery in my brain, such that in the future I will have better ideas?" That thought is a lot harder to have when you think that you yourself are the problem. What is that even supposed to mean that you yourself are the problem? This might not be a meaningful statement, but it is the default interpretation when somebody criticizes you.
The basic idea here is to frame everything without any reference to yourself. It is not me producing a bad plan, but some mechanism whose output I just happened to observe. In my experience, this not only helps alleviate pain but also makes you think thoughts that are more useful.