NaiveTortoise's Short Form Feed

by NaiveTortoise · 1 min read · 11th Aug 2018 · 96 comments

In light of reading Hazard's Shortform Feed -- which I really enjoy -- based on Raemon's Shortform feed, I'm making my own. There be thoughts here. Hopefully, this will also get me posting more.

Anki's Not About Looking Stuff Up

Attention conservation notice: if you've read Michael Nielsen's stuff about Anki, this probably won't be new for you. Also, this is all very personal and YMMV.

In a number of discussions of Anki here and elsewhere, I've seen Anki's value measured in terms of time saved by not having to look stuff up. For example, Gwern's spaced repetition post includes a calculation of the threshold at which it's worth it to Anki-ize something, although I would be surprised if Gwern hasn't already thought about the claim I'm going to make.

While I occasionally use Anki to remember things that I would otherwise have to Google, e.g. statistics, I almost never Anki-ize things so that I can avoid Googling them in the future. And I don't think in terms of time saved when deciding what to Anki-ize.

Instead, (as Michael Nielsen discusses in his posts) I almost always Anki-ize with the goal of building a connected graph of knowledge atoms about an area in which I'm interested. As a result, I tend to evaluate what to Anki-ize based on two criteria:

  1. Will this help me think about this domain without paper or a computer better?
  2. In the Platonic graph of this domain's knowledge ontology, how central i
... (read more)
5riceissa1yI briefly looked at gwern's public database [https://www.gwern.net/Spaced-repetition#see-also] several months ago, and got the impression that he isn't using Anki in the incremental reading/learning way that you (and Michael Nielsen) describe. Instead, he seems to just add a bunch of random facts. This isn't to say gwern hasn't thought about this, but just that if he has, he doesn't seem to be making use of this insight. I feel like the center often shifts as I learn more about a topic (because I develop new interests within it). The questions I ask myself are more like "How embarrassed would I be if someone asked me this and I didn't know the answer?" and "How much does knowing this help me learn more about the topic or related topics?" (These aren't ideal phrasings of the questions my gut is asking.) In my experience, I often still forget things I've entered into Anki either because the card was poorly made or because I didn't add enough "surrounding cards" to cement the knowledge. So I've shifted away from this to thinking something more like "at least Anki will make it very obvious if I didn't internalize something well, and will give me an opportunity in the future to come back to this topic to understand it better instead of just having it fade without detection". I'm confused about what you mean by this. (One guess I have is big-O notation, but big-O notation is not sensitive to constants, so I'm not sure what the 5 is doing, and big-O notation is also about asymptotic behavior of a function and I'm not sure what input you're considering.) I think there are few well-researched and comprehensive blog posts, but I've found that there is a lot of additional wisdom the spaced repetition community has accumulated, which is mostly written down in random Reddit comments and smaller blog posts. I feel like I've benefited somewhat from reading this wisdom (but have benefited more from just trying a bunch of things myself). 
For myself, I've considered writing up wh
3NaiveTortoise1yThose seem like good questions to ask as well. In particular, the second one is something I ask myself although, similar to you, in my gut more than verbally. I also deal with the "center shifting" by revising cards aggressively if they no longer match my understanding. I even revise simple phrasing differences when I notice them. That is, if I repeatedly phrase the answer to a card one way in my head and have it phrased differently on the actual card, I'll change the card. I think both this and the original motivational factor I described apply for me. You're right. Sorry about that... I just heinously abuse big-O notation and sometimes forget to not do it when talking with others/writing. Edited the original post to be clearer ("on the order of 10"). Interesting, I've perused the Anki sub-reddit a fair amount, but haven't found many posts that do what I'm looking for, which is both give good guidelines and back them up with specific examples. This [https://www.reddit.com/r/Anki/comments/43mf83/guide_how_to_anki_maths_the_right_way/] is probably the closest thing I've read to what I'm looking for, but even this post mostly focuses on high level recommendations and doesn't talk about the nitty-gritty such as different types of cards for different types of skills. If you've saved some of your favorite links, please share! I agree that trying stuff myself has worked better than reading. Regarding other topics being more important, I admit I mostly wrote up the above because I couldn't stop thinking about it rather than based on some sort of principled evaluation of how important it would be. That said, I personally would get a lot of value out of having more people write up detailed case reports of how they've been using Anki and what does/doesn't work well for them that give lots of examples. 
I think you're right that this won't necessarily be helpful for newcomers, but I do think it will be helpful for people trying to refine their practice over long periods o
3riceissa1y* I like CheCheDaWaff's comments on r/Anki; see here [https://www.google.com/search?q=site%3Areddit.com%2Fr%2FAnki+%22CheCheDaWaff%22+math] for a decent place to start. In particular, for proofs, I've shifted toward adding "prove this theorem" cards rather than trying to break the proof into many small pieces. (The latter adheres more to the spaced repetition philosophy, but I found it just doesn't really work.)
* Richard Reitz has a Google doc [https://docs.google.com/document/d/1Qu21SMy0DgQzYQBt1jCi416xeK6A-8eg84WA-kqamSM/edit] with a bunch of stuff.
* I like this forum comment [http://web.archive.org/web/20190310050412/https://forum.koohii.com/thread-2275-post-134811.html#pid134811] (as a data point, and as motivation to try to avoid similar failures).
* I like https://eshapard.github.io [https://eshapard.github.io]
* Master How To Learn [https://masterhowtolearn.wordpress.com] also has some insights but most posts are low-quality.
One thing I should mention is that a lot of the above links aren't written well. See this Quora answer [https://www.quora.com/What-are-the-most-underrated-life-skills/answer/Alex-K-Chen] for a view I basically agree with. I agree that thinking about this is pretty addicting. :) I think this kind of motivation helps me to find and read a bunch online and to make occasional comments (such as the grandparent) and brain dumps [https://github.com/riceissa/issarice.com/blob/master/drafts/spaced-repetition.md] , but I find it's not quite enough to get me to invest the time to write a comprehensive post about everything I've learned.
3NaiveTortoise7moSo... I just re-read your brain dump post and realized that you described not only an issue that I encountered but the exact example for which it happened! I indeed have a card for Newton's approximation but didn't remember this fact! That said, I don't know whether I would have noticed the connection had I tried to re-prove the chain rule, but I suspect not. The one other caveat is that I created cards very sparsely when I reviewed calculus so I'd like to think I might have avoided this with a bit more card-making.
3riceissa7moI want to highlight a potential ambiguity, which is that "Newton's approximation" is sometimes used to mean Newton's method [https://en.wikipedia.org/wiki/Newton%27s_method] for finding roots, but the "Newton's approximation" I had in mind is the one given in Tao's Analysis I, Proposition 10.1.7, which is a way of restating the definition of the derivative. (Here [https://www.math.ucla.edu/~tao/resource/general/131ah.1.03w/week78.pdf#page=21] is the statement in Tao's notes in case you don't have access to the book.)
3NaiveTortoise7moAh that makes sense, thanks. I was in fact thinking of Newton's method (which is why I didn't see the connection).
3TurnTrout1yAlthough I haven't used Anki for math, it seems to me like I want to build up concepts and competencies, not remember definitions. Like, I couldn't write down the definition of absolute continuity, but if I got back in the zone and refreshed myself, I'd have all of my analysis skills intact. I suppose definitions might be a useful scaffolding?
3NaiveTortoise1yYou're right on both accounts. Maybe I should've discussed this in my original post... At least for me, Anki serves different purposes at different stages of learning. Key definitions tend to be useful in the early stages, especially if I'm learning something on and off, as a way to prevent myself from having to constantly refer back and make it easier to think about what they actually mean when I'm away from the source. E.g., I've been exploring alternate interpretations of d-separation in my head during my commute and it helps that I remember the precise conditions in addition to having a visual picture. Once I've mastered something, I agree that the "concepts and competencies" ("mental moves" is my preferred term) become more important to retain. E.g., I remember the spectral theorem but wish I remembered the sketch of what it looks like to develop the spectral theorem from scratch. Unfortunately, I'm less clear/experienced on using Anki to do this effectively. I think Michael Nielsen's blog post on seeing through a piece of mathematics [http://cognitivemedium.com/srs-mathematics] is a good first step. Deeply internalizing core proofs from an area presumably should help for retaining the core mental moves involved in being effective in that area. But, this is quite time intensive and also prioritizes breadth over depth. I actually did mention two things that I think may help with retaining concepts and competencies - Anki-izing the same concepts in different ways (often visually) and Anki-izing examples of concepts. I haven't experienced this yet, but I'm hopeful that remembering alternative visual versions of definitions, analogies to them, and examples of them may help with the types of problems where you can see the solution at a glance if you have the right mental model (more common in some areas than others). 
For example, I remember feeling (usually after agonizing over a problem for a while) like Linear Algebra Done Right had a lot of exercises where th

Cruxes I Have With Many LW Readers

There's a crux I seem to have with a lot of LWers that I've struggled to put my finger on for a long time but I think reduces to some combination of:

  • faith in elegance vs. expectation of messiness;
  • preference for axioms vs. examples;
  • identification as primarily a scientist/truth-seeker vs. as an engineer/builder.

I tend to be more inclined towards the latter in each case, whereas I think a lot of LWers are inclined towards the former, with the potential exception of the author of realism about rationality, who seems to have opinions that overlap with many of my own. While I still feel uncomfortable with the above binaries, I've now gathered enough examples to at least list them as evidence for what I'm talking about.

Example 1: Linear Algebra Textbooks

A few LWers have positively reviewed Linear Algebra Done Right (LADR), in particular complimenting it for revealing the inner workings of Linear Algebra. I too recently read most of this book and did a lot of the exercises. And... I liked it but seemingly less than the other reviewers. In particular, I enjoyed getting a lot of practice reading definition-theorem-proof style math and doing lots of

... (read more)

I have similar differences with many people on LW and agree there is something of an unacknowledged aesthetic here.

4jimrandomh1yI think the engineer mindset is more strongly represented here than you think, but that the nature of nonspecialist online discussion warps things away from the engineer mindset and towards the scientist mindset. Both types of people are present, but the engineer-mindset people tend not to put that part of themselves forward here. The problem with getting down into the details is that there are many areas with messy details to get into, and it's hard to appreciate the messy details of an area you haven't spent enough time in. So deep dives in narrow topics wind up looking more like engineer-mindset, while shallow passes over wide areas wind up looking more like scientist-mindset. LessWrong posts can't assume much background, which limits their depth. I would be happy to see more deep-dives; a lightly edited transcript of John Carmack wouldn't be a prototypical LessWrong post, but it would be a good one. But such posts are necessarily going to exclude a lot of readers, and LessWrong isn't necessarily going to be competitive with posting in more topic-specialized places.
3NaiveTortoise1yThese are all good points. After I saw that Benito did a transcript post, I considered doing one for one of Carmack's talks or a recent interview of Yann LeCun I found pretty interesting (based on the talks of his I've listened to, LeCun has a pretty engineering-y mindset even though he's nominally a scientist). Not going to happen immediately though since it requires a pretty big time investment. Alternatively, maybe I'll review Masters of Doom [https://www.amazon.com/dp/B000FBFNL0/ref=dp-kindle-redirect?_encoding=UTF8&btkr=1] , which is where I learned most of what I know about Carmack.
3Pattern1yAs the dichotomy isn't jumping out at me, I guess I should read both of those books* sometime and see which I like more. *Linear Algebra Done Right (LADR) Shilov's Linear Algebra [https://cosmathclub.files.wordpress.com/2014/10/georgi-shilov-linear-algebra4.pdf]
3Ruby1yThis is really interesting, I'm glad you wrote this up. I think there's something to it. Some quick comments:
* I generally expect there to exist simple underlying principles in most domains which give rise to messiness (and often the messiness seems a bit less messy once you understand them). Perceiving "messiness" does also often feel to me like lack of understanding whereas seeing the underlying unity makes me feel like I get whatever the subject matter is.
* I think I would like it if LessWrong had more engineers/inventors as role models and that it's something of an oversight that we don't. Yet I also feel like John Carmack probably isn't remotely near the level of Pearl (I'm not that familiar with Carmack's work): pushing forward video game development doesn't compare to neatly figuring out what exactly causality itself is.
* There might be something like all truly monumental engineering breakthroughs depended on something like a "scientific" breakthrough. Something like Faraday and Maxwell figuring out theories of electromagnetism is actually a bigger deal than Edison(/others) figuring out the lightbulb, the radio, etc. There are cases of lauded people who are a little more ambiguous on the science/engineer dichotomy. Turing? Shannon? Tesla? Shockley et al with the transistor seems kind of like an engineering breakthrough, and seems there could be love for that. I wonder if Feynman gets more recognition because as an educator we got a lot more of the philosophy underlying his work. Just rambling here.
* A little on my background: I did an EE degree which had a very practical focus. My experience is that I was taught how to apply a lot of equations and make things in the lab, but most courses skimped on providing the real understanding that left me overall worse as an engineer. The math majors actually understood Linear Algebra, the physicists actually understood electromagnetism, and
9jimrandomh1yYou're looking at the wrong thing. Don't look at the topic of their work; look at their cognitive style and overall generativity. Carmack is many levels above Pearl. Just as importantly, there's enough recorded video of him speaking unscripted that it's feasible to absorb some of his style.
2Ruby1yBy generativity do you mean "within-domain" generativity? To unpack which "levels" I was grading on, it's something like a blend of "importance and significance of their work" / "difficulty of the problems they were solving", admittedly that's still pretty vague. On those dimensions, it seems entirely fair to compare across topics and assert that Pearl was solving more significant and more difficult problem(s) than Carmack. And for that "style" isn't especially relevant. (This can also be true even if Carmack solved many more problems.) But I'm curious about your angle - when you say that Carmack is many levels above Pearl, which specific dimensions is that on (generativity and style?) and do you have any examples/links for those?
2jimrandomh1yNot exactly, because Carmack has worked in more than one domain (albeit not as successfully; Armadillo Aerospace never made orbit.) Agree on significance, disagree on difficulty.
1NaiveTortoise1yIn an interesting turn of events, John Carmack announced today [https://mobile.twitter.com/ID_AA_Carmack/status/1194754916293722114] that he'll be pivoting to work on AGI.
5mr-hire1yTRIZ is an engineering discipline that has something called the five levels of innovation, which talks about this:
  1. You solve a problem by using a common solution in your own speciality.
  2. You solve a problem using a common solution in your own industry.
  3. You solve a problem using a common solution found in other industries.
  4. You solve a problem using a solution built on first principles (e.g. little known scientific principles.)
  5. You solve a problem by discovering a new principle/scientific rule.
2Ruby1ySeems you're referring to this https://en.wikipedia.org/wiki/TRIZ [https://en.wikipedia.org/wiki/TRIZ?]?
2mr-hire1yYes.
2NaiveTortoise1yThanks for your reply! I agree with a lot of what you said. First off, thanks for bringing up the point about underlying principles. I agree that there are often underlying principles in many domains and that I also really like to find unity in seeming messiness. I used to be of the more extreme view that principles were in some sense more important than the details, but I've become more skeptical over time for two reasons. 1. From a pedagogy perspective, I've personally never had much luck learning principles without having a strong base of practice & knowledge. That said, when I have that base, learning principles helps me improve further and is satisfying. 2. I've realized over time how much of action (where action can include thinking) is based upon a set of non-verbal strategies that one learns through practice and experimentation even in seemingly theoretical domains. These strategies seem to be the secret sauce that allow one to act fluently but seem meaningfully different from the types of principles people often discuss. Another way to phrase my argument is that principles are important but very hard to transfer between minds. It's possible you agree and I'm just belaboring the point but I wanted to make it explicit. One concrete example of the distinction I'm drawing is something called the "What Are Monads Fallacy" [https://two-wrongs.com/the-what-are-monads-fallacy] in the Haskell community where people try to explain monads by conveying their understanding of what monads really are even though they learned about monads by just using them a bunch, which led to them later developing a higher level understanding of them. This reflects a more general problem where experts often struggle to teach to novices because they don't realize that their broad understanding is actually founded upon lower level understanding of a lot of details. I tentatively agree, but it's pretty hard to draw comparisons. From an insight persp
2Ruby1ySorry for the delayed reply on this one. I do think we agree on rather a lot here. A few thoughts: 1. Seems there are separate questions of "how you model/role-models and heroes/personal identity" and separate questions of pedagogy. You might strongly seek unifying principles and elegant theories but believe the correct way to arrive at these and understand these is through lots of real-world messy interactions and examples. That seems pretty right to me. 2. Your examples in this comment do make me update on the importance of engineering types and engineering feats. It makes me think that indeed LessWrong too much focuses only on heroes of "understanding" when there are heroes "of making things happen" which is rather a key part of rationality too. A guess might be that this is downstream of what was focused on in the Sequences and the culture that set. If I'm interpreting Craft and the Community [https://www.lesswrong.com/s/pvim9PZJ6qHRTMqD3/p/aFEsqd6ofwnkNqaXo] correctly, Eliezer never saw the Sequences as covering all of rationality or all of what was important, just his own particular sub-art that he created in the course of trying to do one particular thing. Seemingly, answering confused questions is more science-y than engineering-y and would place focus on great scientists like Feynman. Unfortunately, the community has not yet supplemented the Sequences with the rest of the art of human rationality and so most of the LW culture is still downstream of the Sequences alone (mostly). Given that, we can expect the culture is missing major key pieces of what would be the full art, e.g. whatever skills are involved in being Jeff Dean and John Carmack. About that you might be correct. Personally, I do think I enjoy theory even without application. I'm not sure if my mind secretly thinks all topics will find their application, but having applications (beyond what is needed to understand) doesn't feel key to my interest, so something.
9NaiveTortoise1yAt this point, I basically agree that we agree and that the most useful follow up action is for someone (read: me) to actually be the change they want to see and write some (object-level), and ideally good, content from a more engineering-y bent. As I mentioned in my reply to jimrandomh, a book review seems like a good place for me to start.
2Ruby1yCool. Looking forward to it!

Weird thing I wish existed: I wish there were more videos of what I think of as 'math/programming speedruns'. For those familiar with speedrunning video games, this would be similar except the idea would be to do the same thing for a math proof or programming problem. While it might seem like this would be quite boring since the solution to the problem/proof is known, I still think there's an element of skill to it and would enjoy watching someone do everything they can to get to a solution, proof, etc. as quickly as possible (in an editor, on paper, LaTeX, etc.).

This is kind of similar to streaming ACM/math olympiad competition solving except I'm equally if not more interested in people doing this for known problems/proofs than I am for tricky but obscure problems. E.g., speed-running the SVD derivation.

While I'm posting this in the hope that others are also really interested, my sense is that this would be incredibly niche even amongst people who like math so I'm not surprised it doesn't exist...

5mr-hire8moI'm not super familiar with the competitive math circuit, but my understanding is that this is part of it? People are given a hard problem and either individually or as a team solve it as quickly as possible.
8habryka8moDo you know of any videos on this? Ideally while the person is narrating their thoughts out loud.
3NaiveTortoise8mo3Blue1Brown has a video [https://www.youtube.com/watch?v=OkmNXy7er84] where he sort of does this for a hard Putnam problem. I say "sort of" because he's not solving the problem in real time so much as retrospectively describing how one might solve it.
2habryka8moYeah, that is one of my favorite videos by 3Blue1Brown and more like it would be pretty good.
1NaiveTortoise8moYep, I touched on this above. Personally, I'm less interested in this type of problem solving than I am in seeing someone build to a well-known but potentially easier to prove theorem, but I suspect people solving IMO problems would appeal to a wider audience.
4riceissa6moSomewhat related: https://xenaproject.wordpress.com/2020/05/23/the-complex-number-game/ [https://xenaproject.wordpress.com/2020/05/23/the-complex-number-game/]
1NaiveTortoise6moThis is awesome! I've been thinking I should try out the natural number game for a while because I feel like formal theorem proving will scratch my coding / video game itch in a way normal math doesn't.
4riceissa7moI had a similar idea which was also based on an analogy with video games (where the analogy came from let's play videos rather than speedruns), and called it a live math video [https://learning.subwiki.org/wiki/Live_math_video].
1NaiveTortoise7moCool, I hadn't seen your page previously but our ideas do in fact seem very similar. I think you were right to not focus on the speed element and instead analogize to 'let's play' videos.
3NaiveTortoise8moRelated: here [https://cr.yp.to/papers/calculus.pdf], DJB lays out the primary results of a single-variable calculus course in 11 LaTeX-ed pages.
3AprilSR8moThe problem with this is that it is very difficult to figure out what counts as a legitimate proof. What level of rigor is required, exactly? Are they allowed to memorize a proof beforehand? If not, how much are they allowed to know?
3Pattern8moSolutions might be better to go with than proofs - if the answer is wrong, that's more straightforward to show than whether or not a proof is wrong.
3NaiveTortoise8moYeah what would be ideal is if theorem provers were more usable and then this wouldn't be an issue (although of course there's still the issue of library code vs. from scratch code but this seems easier to deal with). Memorizing a proof seems fine (in the same way that I assume you end up basically memorizing the game map if you do a speedrun).
1MakoYass7moI have a friend who might be into programming speedrunning https://merveilles.town/@cancel/104005117320841920 [https://merveilles.town/@cancel/104005117320841920]
1NaiveTortoise7moSeems like the post you linked is a joke. Were you serious about the friend?
2MakoYass7moSerious in that I mean he might, I'd say, 0.1 that he'd be interested, but if that's not negligible, I think if he took it up he'd be very good at it. I'll ask him.
1NaiveTortoise7moCool!

Watching my kitten learn/play has been interesting from a "how do animals compare to current AIs?" perspective. At a high level, I think I've updated slightly towards RL agents being further along the evolutionary progress ladder than I'd previously thought.

I've seen critiques of RL agents not being able to do long-term planning as evidence for them not being as smart as animals, and while I think that's probably accurate, I have noticed that my kitten takes a surprisingly long time to learn even 2-step plans. For example, when it plays with a toy on a string, I'll often try putting the toy on a chair that it only knows how to reach by jumping onto another chair first. It took many attempts before it learned to jump onto the other chair and then climb to where I'd put the toy, even though it had previously done that while exploring many times. And even then, it seems to be at risk of "catastrophic forgetting" where we'll be playing in the same way later and it won't remember to do the 2-step move. Related to this, its learning is fairly narrow even for basic skills, e.g. I have 4 identical chairs around a table but it will be afraid of jumping onto one even though it's very comforta

... (read more)

I keep seeing rationalist-adjacent discussions on Twitter that seem to bottom out with the arguments of the general (very caricatured, sorry) form: "stop forcing yourself and get unblocked and then X effortlessly" where X equals learn, socialize, etc. In particular, a lot of focus seems to be on how children and adults can just pursue what's fun or enjoyable if they get rid of their underlying trauma and they'll naturally learn fast and gravitate towards interesting (but also useful in the long term) topics, with some inspiration from David Deutsch.

On one hand, this sounds great, but it's so foreign to my experience of learning things and seems to lack the kind of evidence I'd expect before changing my cognitive strategies so dramatically. In fairness, I probably am too far in the direction of doing things because I "should", but I still don't think going to the other extreme is the right correction.

In particular, having read Mason Currey's Daily Rituals, I have a strong prior that even the most successful artists and scientists are at risk of developing akrasia and need to systematize their schedules heavily to ensure that they get their butts in the chair and work. Given this, wh... (read more)

9G Gordon Worley III4moUnblocking motivation is only enough on its own if the motivation is so strong that you feel "hungry" to do something. Long term this kind of hunger is, in my experience, unreliable, so it's not enough just to unblock your ability to do things. You also have to set up the conditions for your motivation to express itself, e.g. through daily rituals as you suggest. For example, a big problem people I talk to had to deal with when shelter-in-place orders hit was that they lost their daily rituals and had to establish new ones. It wasn't that they didn't want to work or do other things they normally do, it was that they lost the normal context in which they did them, and had to establish new contexts in which they expected to find themselves doing the intended activity. Trying to force yourself to do things is like setting up the conditions without unblocked motivation. So I think both things are required, but only one thing is the bottleneck at a time, thus lots of people need advice on one part and not the other at any given moment, creating evidence though that can look like all you need to do is fix one thing and everything else will follow.
6Raemon4moI originally had your experience, and have seen enough people claim to get unblocked that there seems to be at least something to it. At the very least, if you have crippling depression, solving that is often higher impact than incremental skill growth. I wrote up more thoughts about this here [https://www.lessestwrong.com/posts/85J8hjEn48FicYfvp/strategies-of-personal-growth] .
3NaiveTortoise4moThanks for replying and sharing your post. I'd actually read it a while ago but forgotten how relevant it is to the above. To be clear, I totally buy that if you have crippling depression or even something more mild, fixing that is a top priority. I also have enjoyed recent posts on and think I understand the alignment-based models of getting all your "parts" on board. Where I get confused and where I think there's less evidence is that the unblocking can make it such that doing hard stuff is no longer "hard". Part of what's difficult here is that I'm struggling to find the right words but I think it's specifically claims of effortlessness or fun that seem less supported to me.
2Viliam4moGenerally, you have to solve the problem you have. (Related: Anna Karenina principle [https://en.wikipedia.org/wiki/Anna_Karenina_principle].) If your problem happens to be some trauma, fix the trauma. If it is lack of tools, buy the right tools. If it is wasting time on social networks, install a web blocker. And if it's just that you never prioritize doing X, but in retrospect always wish you had, precommit to spend some time doing X. Of course, it could be more of those things together; maybe you have a trauma and also lack the right tools. Then you must solve both. Maybe one is more visible, and you only realize the other after fixing the first one.
1NaiveTortoise4moThis is basically my perspective but seems contrary to the perspective in which most problems are caused by internal blockages, right?
7Viliam4moYep. The idea that everything is caused by internal conflicts, and if we only could resolve all the internal conflicts (which might take a few years of hard work, if you want to do it thoroughly) we would become amazing supermen (so all those years spent on therapy would still be totally worth it), originates from Freud. It is my long-term source of amusement that if you mention Freud or psychoanalysis in the rationalist community, you reliably get "pseudoscience", "it's completely debunked", et cetera... but if you rephrase the same ideas using modern language, without mentioning the source, they become accepted rationalist wisdom.
3Hazard4moOne way I think about things: everything that I've found in myself and close friends that looks and smells like "shoulds" is sorta sneaky. I keep on finding shoulds which seem to have been absorbed from others and are less about "this is a good way to get a thing in the world that I want" and more about "someone said you need to follow this path and I need them to approve of me". The force I feel behind my shoulds is normally "You're SCREWED if you don't!", a sort of vaguely panicky, inflexible energy. It's rarely connected to the actual good qualities of the thing I "should" be doing. Because my shoulds normally ground out in "if I'm not this way, people won't like me", if the pressure gets turned up, following a should takes me farther and farther away from things I actually care about. Unblocking stuff often feels like transcending the panicky fear that hides behind a should. It never immediately lets me be awesome at stuff. I still need to develop a real connection to the task and how it works into the rest of my life. There's still drudgery, but it's dealt with from a calmer place.
1NaiveTortoise4moYes I can relate to this!
3mr-hire4moI think removing internal conflicts is "powerful but not sufficient." The people who are most productive are also great at amplifying external conflicts. That is, they have a clear, strong vision, and amplify the creative tension between what they have and what they know they can have. This can help you do things that are not "fun", like deliberate practice, but are totally aligned, in that you have no objections to doing them, and have a stance of acceptance towards the things that are not enjoyable. The best then augment that with powerful external structures that are supportive of their ideal internal states and external behaviors. Each one of these taken far enough can be powerful, and when combined together they are more than the sum of their parts.
1NaiveTortoise4moThanks, this framing is helpful for me for understanding how these things can be seen to fit together.

Interesting Bill Thurston quote, sadly from his obituary:

I’ve always taken a “lazy” attitude toward calculations. I’ve often ended up spending an inordinate amount of time trying to figure out an easy way to see something, preferably to see it in my head without having to write down a long chain of reasoning. I became convinced early on that it can make a huge difference to find ways to take a step-by-step proof or description and find a way to parallelize it, to see it all together all at once—but it often takes a lot of struggle to be able to do that. I think it’s much more common for people to approach long case-by-case and step-by-step proofs and computations as tedious but necessary work, rather than something to figure out a way to avoid. By now, I’ve found lots of “big picture” ways to look at the things I understand, so it’s not as hard.

To prevent misinterpretation: I think people often look at quotes like this (I've seen similar ones about Feynman) and think "ah yes, see, anyone can do it". But IME the thing he's describing is much harder to achieve than the "case-by-case"/"step-by-step" stuff.

I've recently been obsessing over the idea of trying to "make math more like programming". I'm not sure if it's just because I feel fluent at programming and still not very fluent at abstract math or also because programming really does have a feedback loop that you don't get in math.

Regardless, based on my reading it seems like there's a general consensus in math that even the most modern theorem provers, like Lean and Coq, are much less efficient than typical "informal" math reasoning. That said, I wonder if this ignores some of the benefits that program

... (read more)
4Pattern1yIt seems like a useful idea on a lot of levels. There's a difference between solving a problem where you're 1) trying to figure out what to do. 2) Executing an algorithm. 3) Evaluating a closed form solution (Plugging the values into the equation, performing the operations, and seeing what the number is.)*** Names. If you're writing a program, and you decide to give things (including functions/methods) names like the letters of the alphabet it's hard for other people to understand what you're doing. Including future you. As a math enthusiast I see the benefit of not having to generate names*, but teaching wise? I can see some benefits of merging/mixing. (What's sigma notation? It's a for loop.) Functions. You can say f' is the derivative of f. Or you can get into the fact that there are functions** that take other functions as arguments. You can focus narrowly on functions of one-variable. Or you can notice that + is a function that takes two numbers (just like *, /, ^). *Like when your idea of what you're doing /with something changes as you go and there's no refactoring tool on paper to change the names all at the last minute. (Though paper feels pretty nice to work with. That technology is really ergonomic.) **And that the word function has more than one meaning. There's a bit of a difference between a way of calculating something and a lookup table. ***Also, seeing how things generalize can be easier with tools that can automatically check if the changes you've made have broken what you were making. (Writing tests.)
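To make the "sigma notation is a for loop" and "functions that take other functions as arguments" points concrete, here is a minimal Python sketch. The names `sigma` and `derivative` are made up for illustration, and the derivative is only a numerical approximation (a central difference), not a symbolic one.

```python
from typing import Callable


def sigma(term: Callable[[int], float], lower: int, upper: int) -> float:
    """Sum term(i) for i = lower, ..., upper (inclusive), as sigma notation does."""
    total = 0.0
    for i in range(lower, upper + 1):
        total += term(i)
    return total


def derivative(f: Callable[[float], float], h: float = 1e-6) -> Callable[[float], float]:
    """Take a function f and return a new function approximating f'."""
    def f_prime(x: float) -> float:
        # Central difference: a numerical stand-in for the true derivative.
        return (f(x + h) - f(x - h)) / (2 * h)
    return f_prime


# Sigma notation with term i^2 and bounds 1..10.
sum_of_squares = sigma(lambda i: i * i, 1, 10)

# "f' is the derivative of f" becomes: derivative maps a function to a function.
square_prime = derivative(lambda x: x * x)
```

Both `sigma` and `derivative` take a function as an argument; `derivative` also returns one, which is the sense in which "the derivative" is itself a function on functions.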

Blockchain idea inspired by 80,000 Hours's interview of Vitalik Buterin: a lot of podcasts either have terrible transcriptions or presumably pay a service to transcribe their sessions. However, even these services make minor typos such as "ASX" instead of "ASICs" in the linked interview.

Now, most people who read these transcripts presumably notice at least a subset of these typos but don't want to go through the effort of emailing podcasters to tell them about it. On the flip side, there's no good way for hosts to scalabl... (read more)

I've been reading a bit about John Conway since his (unfortunate) death. One thing I keep noticing is that everyone seems to emphasize how core having fun was to John Conway's way of doing math. One question I'm interested in in general is how important fun and curiosity are for doing good research.

I've considered posting a question about this that uses John Conway as an example of someone who 1) was genuinely curious and fun-loving but 2) also had other gifts that played a large role in his ability to do great math. But, I don't want to be insensitive giv

... (read more)
3Ben Pace7moI also expect you'll get answers that are focused on his legacy if you ask that kind of question about him now. Feynman is the central example I think of for this, and there's a lot more published about and by him, so I'd suggest using him. (I think there is a strong connection between fun and curiosity and doing good research.)
1NaiveTortoise7moThanks for the feedback!

It seems like (unless I just haven't discovered it yet) there's a sore need for a framework for causal model comparison, analogous to Bayesian model comparison. If you read Pearl (and his students), they rightfully point out that you can't get causal claims without causal assumptions but don't talk much about how you actually formulate the causal model in the first place ("domain knowledge"). As a result, if you look at the literature, researchers mostly seem to use a small set of causal models that may or may not describe phenomena, e.g. the classic "inst

... (read more)
1NaiveTortoise10moI forgot to include the disclaimer besides statistical independence tests, which can invalidate graphs but are difficult in practice.

Epistemic status: Thinking out loud.

Introducing the Question

Scientific puzzle I notice I'm quite confused about: what's going on with the relationship between thinking and the brain's energy consumption?

On one hand, I'd always been told that thinking harder sadly doesn't burn more energy than normal activity. I believed that and had even come up with a plausible story about how evolution optimizes for genetic fitness, not intelligence, and introspective access is pretty bad as it is, so it's not that surprising that we can't crank up our brains' energy con

... (read more)
1eigen1yThe ESPN article [https://www.espn.com/espn/story/_/id/27593253/why-grandmasters-magnus-carlsen-fabiano-caruana-lose-weight-playing-chess] had a misleading title. They go on to say that a player burns 6000 calories a day, but Caruana [https://en.wikipedia.org/wiki/Fabiano_Caruana] runs an hour a day (or more). These Grandmasters are not reaching into some esoteric mental ability and burning more calories that way; if anyone has ever seen a Grandmaster play against many players at once, or blindfolded (or even blindfolded and against many players!) one can really understand that they see the board in a way that's pretty different from us. The classical theory for this is that they have formed bigger/better chunks than us from excessive playing (the very same way a Mathematician or a Basketball player does). Calorie consumption is thus merely a correlate in that specific context. Although, I think, a (weak) connection could be made between the use of Language and the formation or use of these chunks (who's to say this is not a specialized use of Language?) in the context of a tournament, but I have yet to see anything that supports this idea.
1NaiveTortoise1yMy takeaway from the article was that, to your point, their brains weren't using more energy. Rather, the best hypothesis was just that their adrenal hormones remained elevated for many hours of the day, leading to higher metabolism during that period. Running an hour a day is definitely not enough to burn 6000 calories for the record (a marathon burns around 3500). Maybe I wasn't clear, but that's what I meant by the following.
1eigen1yGot it! Then I agree with you. I think a better description of my point would be that, yeah, these guys are not burning calories by thinking better or harder. Their exercise plus the higher-stress environment could alone account for the high number of calories they burn.

ML-related math trick: I find it easier to imagine a 4D tensor, say of dimensions n × m × k × l, as a big matrix with dimensions n × m within which are nested matrices of dimensions k × l. The nice thing about this is that, at least for me, it makes it easier to imagine applying operations over the matrices in parallel, which is something I've had to think about a number of times doing ML-related programming, e.g. trying to figure out how to write the code to apply a 1D convolution-like operation to an entire batch in parallel.
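A rough NumPy sketch of this picture follows. It is not taken from any particular codebase: the sizes `n, m, k, l`, the projection matrix, and the length-3 kernel are all made up for illustration. Indexing the first two axes picks out one nested matrix, and an operation on the trailing axes is applied to every nested matrix at once.

```python
import numpy as np

# Hypothetical sizes: an n x m grid of cells, each holding a k x l matrix.
n, m, k, l = 2, 3, 4, 5
tensor = np.arange(n * m * k * l, dtype=float).reshape(n, m, k, l)

# "Big matrix" view: tensor[i, j] is the k x l matrix nested at row i, column j.
inner = tensor[1, 2]

# An operation on the trailing two axes hits every nested matrix in parallel.
# Here each nested k x l matrix is multiplied by the same l x 2 matrix.
proj = np.ones((l, 2))
projected = tensor @ proj  # shape (n, m, k, 2)

# 1D convolution-like operation applied to a whole batch at once:
# slide a length-3 kernel along the last axis of a (batch, length) array.
signals = np.arange(12, dtype=float).reshape(3, 4)  # 3 signals of length 4
kernel = np.array([1.0, 0.0, -1.0])
windows = np.lib.stride_tricks.sliding_window_view(signals, kernel.size, axis=1)
# windows has shape (batch, length - 2, 3); contract the window axis with the kernel.
filtered = windows @ kernel  # shape (3, 2)
```

Note that the last step is a cross-correlation (the kernel is not flipped), which is what ML frameworks usually call a convolution.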

1crabman1yI've been studying tensor decompositions and approximate tensor formats for half a year. Since I've learned about tensor networks, I've noticed that I can draw them to figure out how to code some linear operations on tensors. Once I used this to figure out how to implement backward method of some simple neural network layer (not something novel, it was for the sake of learning how deep learning frameworks work). Another time I needed to figure out how to implement forward method for a Conv2d layer with weights tensor in CP format. After drawing its output as a tensor network diagram, it was clear that I could just do a sequence of 3 Conv2d layers: pointwise, depthwise, pointwise. I am not saying that you should learn tensor networks, it's probably a lot of buck for not too large bang unless you want to work with tensor decompositions and formats.
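One possible reading of the pointwise, depthwise, pointwise observation, sketched in NumPy rather than with real Conv2d layers. The factor names `A`, `B`, `C`, the rank, and all sizes are assumptions for illustration; the comment above does not say exactly which CP layout was used. The sketch assumes the weight tensor factors as W[o, i, h, w] = sum_r A[o, r] B[i, r] C[r, h, w], with no padding or stride.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
c_in, c_out, rank, kh, kw = 3, 4, 2, 3, 3
height, width = 6, 7

# Hypothetical CP factors of the convolution weight tensor.
A = rng.normal(size=(c_out, rank))
B = rng.normal(size=(c_in, rank))
C = rng.normal(size=(rank, kh, kw))
W = np.einsum("or,ir,rhw->oihw", A, B, C)

x = rng.normal(size=(c_in, height, width))


def patches(t):
    """All kh x kw windows of a (channels, H, W) array, shape (channels, H', W', kh, kw)."""
    return sliding_window_view(t, (kh, kw), axis=(1, 2))


# Reference: one dense "valid" convolution (cross-correlation, as in ML frameworks).
direct = np.einsum("oihw,ipqhw->opq", W, patches(x))

# The same map as three cheaper layers.
z = np.einsum("ir,ipq->rpq", B, x)              # pointwise: c_in -> rank channels
u = np.einsum("rhw,rpqhw->rpq", C, patches(z))  # depthwise: one kh x kw filter per channel
factored = np.einsum("or,rpq->opq", A, u)       # pointwise: rank -> c_out channels
```

Under these assumptions `direct` and `factored` agree up to floating-point error, since the three contractions just regroup the sums over input channels, spatial offsets, and the rank index.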
1NaiveTortoise1yFrom cursory Googling, it looks like tensor networks are mostly used for understanding quantum systems. I'm not opposed to learning about them, but is there a good resource you can point me to that introduces them independent of the physics concepts? Were you learning them for use in physics? For example, have you happened to read this Google AI paper [https://arxiv.org/abs/1905.01330] introducing their TensorNetworks library and giving an overview?
1crabman1yUnfortunately I don't know any quantum stuff. I learned them for machine learning purposes. A monograph by Cichocki et al. (part 1 [https://arxiv.org/abs/1609.00893], part 2 [https://arxiv.org/abs/1609.00893]) is an overview of how tensor decompositions, tensor formats, and tensor networks can be used in machine learning and signal processing. I think it lacks some applications, including acceleration and compression of neural networks by compression of weights of layers using tensor decompositions (this also sometimes improves accuracy, probably by reducing overfitting). Tensor decompositions and Applications by Kolda, Bader 2009 [http://www.kolda.net/publication/koba09/] - this is an overview of tensor decompositions. It doesn't have many machine learning applications. Also it doesn't talk of tensor networks, only about some simplest tensor decompositions and specific tensor formats which are the most popular types of tensor networks. This paper was the first thing I read about all the tensor stuff, and it's one of the easier things to read. I recommend you read it first and then look at the topics that seem interesting to you in Cichocki et al. Tensor spaces and numerical tensor calculus - Hackbusch 2012 [http://gen.lib.rus.ec/book/index.php?md5=341024D46943ADC4FD8A49CD91694DBC] - this textbook covers mathematics of tensor formats and tensor decompositions for Hilbert and Banach spaces. No applications, a lot of math, functional analysis is kinda a prerequisite. Very dense and difficult to read textbook. Also doesn't talk of tensor networks, only about specific tensor formats. -------------------------------------------------------------------------------- Handwaving and interpretive dance [https://arxiv.org/abs/1603.03039] - This is simple, it's about tensor networks, not other tensor stuff. It's for physicists but chapter 1 and maybe other chapters can be read without physics background. 
-----------------------------------------------------------------------

How to remember everything (not about Anki)

In this fascinating article, Gary Marcus (now better known as a Deep Learning critic, for better or worse) profiles Jill Price, a woman who has an exceptional autobiographical memory. However, unlike others that studied Price, Marcus plays the role of the skeptic and comes to the conclusion that Price's memory is not exceptional in general, but instead only for the facts about her life, which she obsesses over constantly.

Now obsessing over autobiographical memories is not something I'd recommend to people, but re

... (read more)
7mr-hire7moI would love to see her cognitive strategy modeled in more depth. What are the beliefs and emotions that are sustaining that constant mulling?
1NaiveTortoise7moIt seems not that conscious. I suspect it's similar to very scrupulous people who just clean / tidy up by default. That said, I am very curious whether it's cultivatable in a less pathological way.

Sometimes there are articles I want to share, like this one, where I don't generally trust the author and they may have quite (what I consider) wrong views overall but I really like some of their writing. On one hand, sharing the parts I like without crediting the author seems 1) intellectually / epistemically dishonest and 2) unfair to the author. On the other hand, providing a lot of disclaimers about not generally trusting the author feels weird because I feel uncomfortable publicly describing why I find them untrustworthy.

Not really sure what to do her

... (read more)
4mr-hire7moOne thing to do could just be to add an "epistemic status" to articles you share, with some being like "interesting writing" or "made me think" and others being like "agree" or "seems basically correct"
1NaiveTortoise7moYeah good idea.

Taking Self-Supervised Learning Seriously as a Model for Learning

It seems like if we take self-supervised learning (plus a sprinkling of causality) seriously as key human functions, we can more directly enhance our learning by doing much more prediction / checking of predictions while we learn. (I think this is also what predictive processing implies but don't understand that framework as well.)

(Removed.)

[This comment is no longer endorsed by its author]
7eigen1y*writing the movie right now* Relevant here: https://www.lesswrong.com/posts/bshZiaLefDejvPKuS/dying-outside [https://www.lesswrong.com/posts/bshZiaLefDejvPKuS/dying-outside]
-1agai1yComment removed for posterity.
0[anonymous]1yI have reported this comment. Hopefully the mods will remove it. Please don’t speculate on the identity of Satoshi, or spread speculation by others. It has led in multiple cases to people being stalked, blackmailed, harassed, and mugged. Posts like this put innocent lives in physical danger. Be responsible and keep this sort of thing off the Internet.
3NaiveTortoise1y(Note: responded quickly before removing. I've since edited this comment now that I have more time. Also I'm not the person who downvoted your post.) I definitely did not intend to cause anyone or their family danger (or harassment, etc.), so I've removed the post. Mostly in the selfish interest of showing that I wasn't being negligent, I did consider this risk before posting. That's why I noted that I have no information beyond what's already public and was taking into account that since I heard this speculation on a podcast which involved one relatively prominent cryptocurrency person (I won't say who so as not to publicize it further), it seemed unlikely that my post would add additional noise. All that said, I still agree that even a small chance of harm is more than enough reason to remove the post. Especially, since: 1. it seems like you're more involved in the crypto community than I and therefore probably have more context than I do on this topic; and 2. my own version of integrity includes not doing things that only don't cause bad outcomes because they're obscure (related to my second point above).
11[anonymous]1y

Thank you. Yes it is a real problem, speaking from experience from the people I personally know. The reason these events are not talked about much is that any press just makes the problem worse—it gives a bunch of copycat muggers the same bright idea. So unfortunately you get a bunch of speculation and not a lot of observable evidence of the downsides of that speculation, so people don’t realize the harm that has been caused.

There are people who have been killed in attempted bitcoin muggings. Speculating on the Internet that someone is in possession of >1 million bitcoins is like tattooing a big target on their back they can’t get rid of.

2NaiveTortoise1yThanks, that helps contextualize.
3eigen1yFor the record I'm one who downvoted Mark; I don't agree with him and I think it's sad that you, an1lam, removed the original post, which I don't think did any harm whatsoever (the reasons should be pretty obvious: a random short-form post about a hypothetical movie is somehow evidence that Hal was Satoshi? I do not think so at all).
3[anonymous]1yThe risk to innocents is real. Physical security is a really hard problem for people in this space, and the police won’t protect those at risk. Does one post on one rationalist website really matter? Yes, for the same reason your vote matters at the ballot box. This is the collective action problem. If nobody self-censors a statement that puts people at risk, the risks only increase over time and those who help propagate the info are morally culpable.

Weird thought I had based on a tweet about gradient descent in the brain: it seems like one under-explored perspective on computational graphs is the causal one. That is, we can view propagating gradients through the computational graph as assessing the effect of an intervention on some variable on all of a node's children.

Reason to think this might be useful:

  • *Maybe* this can act as a different lens for examining NN training?

Reasons why this might not be useful:

  • It's not obvious that it makes sense to think of nodes in an NN (or any differenti
... (read more)
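A toy illustration of the gradient-as-intervention reading, as a sketch only. The graph (z = x², y = sin(z) + x) and the `z_override` mechanism are invented for this example; overriding z cuts it off from its parent x, which is only loosely analogous to a do-intervention.

```python
import math


def forward(x, z_override=None):
    """Tiny computational graph: z = x ** 2, y = sin(z) + x.

    Passing z_override replaces z's computed value, i.e. z is set by hand
    instead of by its parent x.
    """
    z = x ** 2 if z_override is None else z_override
    y = math.sin(z) + x
    return z, y


x = 1.5
z, y = forward(x)

# Backprop quantity: partial derivative of y with respect to the node z, holding x fixed.
dy_dz = math.cos(z)

# Interventional estimate: nudge z directly and watch its child y.
eps = 1e-6
_, y_nudged = forward(x, z_override=z + eps)
intervention_effect = (y_nudged - y) / eps
```

For a small nudge, `intervention_effect` matches `dy_dz` closely, which is the sense in which the local gradient at a node measures the effect of a small intervention on that node's children.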

If algebra's a deal with the devil where you get the right answer but don't know why, then geometric intuition's a deal with the devil where you always get an answer but don't know whether it's right.

Someone should write the equivalent of TAOCP for machine learning.

(Ok, maybe not literally the equivalent. I mean Knuth is... Knuth. So it doesn't seem realistic to expect someone to do something as impressive as TAOCP. And yes, this is authority worship and I don't care. He's Knuth goddamn it.)

Specifically, a book where the theory/math's rigorous but the algorithms are described in their efficient forms. I haven't found this in the few ML books I've read parts of (Bishop's Pattern Recognition and Machine Learning, MacKay's Information Theory, and Tibrisha

... (read more)

Today I attended the first of two talks in a two-part mini-workshop on Variational Inference. It's interesting to think about from the perspective of my recent musings about more science-y vs. engineering mindsets because it highlighted the importance of engineering/algorithmic progress in widening Bayesian methods' applicability.

The presenter, who's a fairly well known figure in probabilistic ML and has developed some well known statistical inference algorithms, talked about how part of the reason so much time was spent debating philosophical issues in the pa

... (read more)

Link post for a short post I just published describing my way of understanding Simpson's Paradox.

Thing I desperately want: tablet native spaced repetition software that lets me draw flashcards. Cloze deletions are just boxes or hand-drawn occlusions.

I'm interested in reading more about what might've been going on in Ramanujan's head when he did math. So far, the best thing I've found is this.