some lessons from ml research:
I agree and this is why research grant proposals often feel very fake to me. I generally just write up my current best idea / plan for what research to do, but I don't expect it to actually pan out that way and it would be silly to try to stick rigidly to a plan.
i recently ran into a vegan advocate tabling in a public space, and spoke briefly to them for the explicit purpose of better understanding what it feels like to be the target of advocacy on something i feel moderately sympathetic towards but not fully bought in on. (i find this kind of thing very valuable for noticing flaws in myself and improving; it's much harder to be perceptive of one's own actions otherwise). the part where i am genuinely quite plausibly persuadable of his position in theory is important; i think if i had talked to e.g. flat earthers one might say my reaction is just because i'd already decided not to be persuaded. several interesting things i noticed (none of which should be surprising or novel, especially for someone less autistic than me, but as they say, intellectually knowing things is not the same as actual experience):
I claim that even if the openai contract is not meaningfully weaker safety wise, it is still bad for openai to publicly signal solidarity with ant but then sign with DoW.
suppose hypothetically the only difference between the openai and anthropic contracts is that the DoW wanted a snickers bar, and anthropic didn't want to give DoW the snickers bar. even then, it would be a huge dick move for openai to publicly signal solidarity, and then sign with DoW to give them the snickers bar.
theory: a huge part of having a good social life is just taking social bids whenever they become available. examples of social bids both large and small include: deciding whether to join your friends on a roadtrip; getting to know someone you just met; getting to better know someone you bump into occasionally but usually never talk to; standing in line, seeing something amusing, and having the option to point this out to another stranger in line; saying something funny in a group conversation; following up over text with someone after meeting them; flirting; cold emailing someone on the internet; catching up with a friend.
there are a variety of reasons why we might end up not taking social bids. if you don't have the social ability to notice opportunities to take bids, you might miss bids that you could take. if you force yourself to take bids without the requisite social ability, and end up taking bids which you incorrectly believe to exist, you might act in ways that people find weird, and burn potential connections, or intrude on people. if you are really tired or low-bandwidth or depressed or stressed, you will not want to take bids, because taking bids requires quite a lot of ...
it's surprising just how much of cutting edge research (at least in ML) is dealing with really annoying and stupid bottlenecks. pesky details that seem like they shouldn't need attention. tools that in a good and just world would simply not break all the time.
i used to assume this was merely because i was inexperienced, and that surely eventually you learn to fix all the stupid problems, and then afterwards you can just spend all your time doing actual real research without constantly needing to context switch to fix stupid things.
however, i've started to think that as long as you're pushing yourself to do novel, cutting edge research (as opposed to carving out a niche and churning out formulaic papers), you will always spend most of your time fixing random stupid things. as you get more experienced, you get bigger things done faster, but the amount of stupidity is conserved. as they say in running: it doesn't get easier, you just get faster.
as a beginner, you might spend a large part of your research time trying to install CUDA or fighting with python threading. as an experienced researcher, you might spend that time instead diving deep into some complicated distributed trai...
Not only is this true in AI research, it’s true in all science and engineering research. You’re always up against the edge of technology, or it’s not research. And at the edge, you have to use lots of stuff just behind the edge. And one characteristic of stuff just behind the edge is that it doesn’t work without fiddling. And you have to build lots of tools that have little original content, but are needed to manipulate the thing you’re trying to build.
After decades of experience, I would say: any sensible researcher spends a substantial fraction of time trying to get stuff to work, or building prerequisites.
This is for engineering and science research. Maybe you’re doing mathematical or philosophical research; I don’t know what those are like.
a corollary is i think even once AI can automate the "google for the error and whack it until it works" loop, this is probably still quite far off from being able to fully automate frontier ML research, though it certainly will make research more pleasant
I think there are several reasons this division of labor is very minimal, at least in some places.
the following fictional dialogue is a complete unapologetic strawman but it's funny enough i had to bring it into being:
“So I asked myself: where can I make the most impact? And clearly malaria is the most important area.”
“And so you decided to donate all of your money to buy malaria nets?”
“Well, so it turns out that saving lives from malaria is actually kind of expensive and indirect. You see, it costs thousands of dollars to save a life. Statistically. Who knows if you’re actually changing anyone's life that way?”
“And so you found a more efficient way to save lives.”
“Actually, it turns out that it’s cheaper to give people malaria. It's a lot more impactful and the technical problems are more interesting.”
"I see. Isn't more malaria bad though?"
"I don't know, but I find it much easier to work on because the feedback loops are much tighter. Maybe one day, if malaria gets big enough, I’ll go work on saving people from malaria. But we're still a long way away from everyone having malaria."
"I became a scientist because I wanted to change the world," said Dr Connor.
"There are no better opportunities to change the world than here at Effective Evil," said Doug.
"I meant 'change the world for the better'," said Dr Connor.
"Then you should have been more specific," said Doug.
"to the success of our hopeless cause" is such a good toast and we should use it more often. i first learned of it from the book of the same name, and apparently it was a common refrain at gatherings of Soviet dissidents. i like it because it captures the feeling of trying really hard to succeed despite being in the basement of the logistic success curve, and somehow, despite all odds, actually succeeding in the end.
I do find it poetic, but in seriousness I think if folks don't actually feel hopeful about what they're doing then they should do something else - leave the work / research direction / engineering / comms / whatnot to whoever actually feels hope about it...
To elaborate, the thing that's poetic for me about "our hopeless cause" is that I have hope that is not cleanly legible to the outside and is easy to write off as "hopeless". And it's important to stay in tune with your own knowings about this stuff. I think there are very deleterious effects from throwing energy into things one doesn't have hope in.
(...And to elaborate further, mostly I think the bad stuff happens by lending support to corrupt things. And imo being pushed to work on X while you lack hope in X is a solid flag of corruption.)
This is a good heuristic when you're fighting against nature; it's not a good heuristic when you're trying to solve coordination problems.

the problem was that everyone hated living in the Soviet Union and other eastern Bloc countries, but few people were willing to stand up and protest, because doing so meant a knock on your door by men with guns who would take you away to a Siberian prison or mental institution.
the thing with protests is they are a coordination problem. to loosely paraphrase one of the dissidents from this era, if one person protests he becomes a martyr. if ten people protest they become a conspiracy. if ten thousand people protest the system has to change.
the problem is you have no way of knowing when the right moment is. under Stalin, dissent was impossible. everyone even suspected of being disloyal was instantly executed or thrown in a gulag.
after he died, Khrushchev denounced Stalin's methods and instituted reforms, and dissent meant "only" being interrogated by the KGB, put on trial in a rigged but no longer completely farcical show trial, and sent to Siberia for only 10 years rather than being executed. this was enough easing up that the "chain reaction" started happening - people would protest, be arrested, someone would go secretly write a transcript of the trial and publish it, people would...
If ten thousand people protest, sometimes they get massacred by the army.
Iran is a recent example of this.
funny enough, at least one dissident at the time expressed that he didn't like this toast because he wouldn't be trying to dissent if he thought it was hopeless
running the agi survey really reminded me just how brutal statistical significance is, and how unreliable anecdotes are. even setting aside sampling bias of anecdotes, the sheer sample size you need to answer a question like "do more people this year know what agi is than last year" is kind of depressing - you need like 400 samples for each year just to be 80% sure you'd notice a 10 percentage point increase even if it did exist, and even if there was no real effect you'd still think there was one 5% of the time. this makes me a lot more bearish on vibes in general.
thank you for this post. "bearish on vibes" is a great phrase. i am constantly hung up on the fact that it's not really possible to "know what normal people are like", "know what people are like generally", "know what the world is actually like", without significant amounts of effort.
i think this background fact taints like... most discussion of social and ethical issues.
like, suppose i anecdotally noticed a few people be visibly confused when i said the phrase AGI in normal conversation last year, and then this year i noticed that many fewer people were visibly confused by AGI. then, this would tell me almost nothing about whether name-recognition of AGI increased or decreased; at n=10, it is nearly impossible to say anything whatsoever.
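a rough sketch of where a number like "400 per year" can come from, using the textbook normal approximation for comparing two proportions. the 50% baseline, the two-sided 5% threshold, and the function names are all assumptions made for illustration, not details of the actual survey analysis. under those assumptions the required sample comes out somewhat under 400 per year, and at n=10 per year the chance of noticing a real 10 point increase is only a few percent (the approximation is crude at that size, but the conclusion is not sensitive to that).

```python
from math import ceil, sqrt
from statistics import NormalDist


def samples_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # false-positive threshold
    z_beta = NormalDist().inv_cdf(power)           # desired detection rate
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def approx_power(p1, p2, n, alpha=0.05):
    """Approximate chance of detecting a real p1 -> p2 shift with n per group."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)
    return NormalDist().cdf(abs(p2 - p1) / standard_error - z_alpha)


# assumed scenario: awareness goes from 50% last year to 60% this year
print("samples needed per year:", samples_per_group(0.50, 0.60))
print("power with n=10 per year:", round(approx_power(0.50, 0.60, 10), 3))
```

the required sample size depends on the assumed baseline; a baseline near 50% is close to the worst case for this formula.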
in research, if you settle into a particular niche you can churn out papers much faster, because you can develop a very streamlined process for that particular kind of paper. you have the advantage of already working baseline code, context on the field, and a knowledge of the easiest way to get enough results to have an acceptable paper.
while these efficiency benefits of staying in a certain niche are certainly real, I think a lot of people end up in this position because of academic incentives - if your career depends on publishing lots of papers, then a recipe to get lots of easy papers with low risk is great. it's also great for the careers of your students, because if you hand down your streamlined process, then they can get a phd faster and more reliably.
however, I claim that this also reduces scientific value, and especially the probability of a really big breakthrough. big scientific advances require people to do risky bets that might not work out, and often the work doesn't look quite like anything anyone has done before.
as you get closer to the frontier of things that have ever been done, the road gets tougher and tougher. you end up spending more time building basic infra...
the modern world has many flaws, but I'm still deeply grateful for the modern era of unprecedented peace, prosperity, and freedom in the developed world. 99% of people reading these words have never had to worry about dying in a cholera epidemic, or malaria or smallpox or the plague, or childbirth, or in war, or from a famine, or due to a political purge. this is not true for other times in history, or other places in the world today.
(extremely unoriginal thought, but still important to acknowledge periodically because it's easy to take for granted. especially because it's much more common to complain about ways the world is broken than to acknowledge what has improved over time.)
I think it would be really bad for humanity to rush to build superintelligence before we solve the difficult problem of how to make it safe. But also I think it would be a horrible tragedy if humanity never ever built superintelligence. I hope we figure out how to thread this needle with wisdom.
I agree with this fwiw. Currently I think we are in way way more danger of rushing to build it too fast than of never building it at all, but if e.g. all the nations of the world had agreed to ban it, and in fact were banning AI research more generally, and the ban had held stable for decades and basically strangled the field, I'd be advocating for judicious relaxation of the regulations (same thing I advocate for nuclear power basically).
I am not really clear that I should be worried on the scale of decades? If we're doing a calculation of expected future years of a flourishing technologically mature civilization, slowing down for 1,000 years here in order to increase the chance of success by like 1 percentage point is totally worth it in expectation.
Given this, it seems plausible to me that one should spend 200 years trying to improve civilizational wisdom and decision-making rather than attempting to specifically just unlock regulation on AI (of course the specifics here are cruxy).
I agree that 200 years would be worth it if we actually thought that it would work. My concern is that it's not clear civilization would get better/more sane/etc. over the next century vs. worse. And relatedly, every decade that goes by, we eat another percentage point or three of x-risk from miscellaneous other sources (nuclear war, pandemics, etc.) which basically impose a time-discount factor on our calculations large enough to make a 200 year pause seem really dangerous and bad to me.
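To put rough numbers on the "percentage point or three per decade" concern, here is a minimal sketch that compounds a per-decade risk over a pause. It assumes the risk is constant and independent from decade to decade, which is a simplification added here for illustration; the function name is likewise hypothetical.

```python
def survival_probability(risk_per_decade, years):
    """Chance of avoiding catastrophe, assuming a constant independent risk each decade."""
    decades = years / 10
    return (1 - risk_per_decade) ** decades


for risk in (0.01, 0.02, 0.03):
    cumulative_risk = 1 - survival_probability(risk, 200)
    print(f"{risk:.0%} per decade over 200 years: {cumulative_risk:.0%} cumulative risk")
```

Under these assumptions the cumulative background risk over 200 years runs from a bit under a fifth to a bit under a half, which is the sense in which the per-decade risk acts like a discount factor.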
while I agree for smaller numbers like a few decades, I don't think I agree with a 1000 year pause.
I think (a) it's perfectly reasonable for people to be selfish and care about superintelligence happening during their lifetime (forget future people and discount factors thereof - almost every single person alive today cares ooms more about themselves than about some random person on the other side of the planet), (b) it's easy for "delay forever" people to basically pascal's mug you this way, as in nuclear power, and (c) it's unclear that humanity becomes monotonically more wise over time (as an unrealistic example, consider a world where we successfully create an international treaty to ensure ASI is safe, and then for some reason the entire modern world order collapses and the only actors left are random post-collapse states racing to build ASI. then it would have been better to build ASI in a functional pre-collapse world order than to delay. one could reasonably (though i personally don't) believe that the current world order is likely to fail in the coming decades and ASI is better built now than in the ensuing chaos)
i think it’s plausible humans/humanity should be carefully becoming ever more intelligent forever and not ever create any highly non-[human-descended] top thinker[1]
[1] i also think it's confused to speak of superintelligence as some definite thing (like, to say "create superintelligence", as opposed to saying "create a superintelligence"), and probably confused to speak of safe fooming as a problem that could be "solved", as opposed to one needing to indefinitely continue to be thoughtful about how one should foom
religion is selling your soul
a lot of people say things like "sure, religion might not exactly be totally true, but it has lots of benefits, and there really does seem to be a god shaped hole in many people, so who can really say if it's good". i think this is directionally correct but kind of cowardly.
i think the correct take on religion is first that its claims are completely and utterly false; obviously the christian god doesn't literally exist, jesus never came back from the dead, etc. this is so overdone by the old internet atheists that it would be beating a dead horse to harp on further.
secondly, the human condition involves a whole bunch of things that are kind of sucky. for example, the fact that we only have a very short amount of time on this planet before we die forever is utterly terrifying; or, the fact that it can be very difficult to find a source of meaning to ground our motivation in, and that it really sucks to not have a reliable foundation for motivation; or, the difficulty of connecting with other people despite differences.
i claim that there is a true solution to each of these problems that involves a very difficult never ending journey of discovery of the ...
the human condition involves a whole bunch of things that are kind of sucky. for example, the fact that we only have a very short amount of time on this planet before we die forever is utterly terrifying...
i claim that there is a true solution to each of these problems that involves a very difficult never ending journey of discovery of the self, understanding and connecting with your emotions, constructing intellectual frameworks, and even technological development
In the spirit of your post: Is not this also cope? (Except for the last bit about technological development, maaaybe.)
Like why would evolution have given you the tools to have helped reconcile you to death, anomie, and lack of motivation, and lack of connection? Why should "understanding and connecting with your emotions" and "discovery of the self" be an affordance in this world that lets you actually find a true solution to the human condition? Why should there be a "true solution" to such problems at all?
Like at least -- if religion were true -- it would make sense for a benevolent God to have created a path that would make you and those around you happy. It's internally consistent, in some sense. But if you were made by godshatter evolution, why would there be any path that looks like "internal development" that satisfies these questions? Isn't the null hypothesis that a "never ending journey of discovery of the self" just as much a fake-ass story as Jesus dying for your sins?
this post was prompted by reading books like Crime and Punishment and The Death of Ivan Ilyich which are amazing except for the parts where they worship religion. they're not necessarily even wrong for their time - back in the day, the glorious transhumanist future was so far away that it wasn't nearly as worth taking into consideration. but the world has changed a lot and the end times are nigh.
"The real Magic was friends we made along the way!"
"Wrong. FIREBALLLL *explosion* "
People really believe there is a God, it's not fair to redefine it to point to some Leviathan-like thing which arises from people acting like it breathes down their necks. For one thing, the religious people would say that you are wrong in general and about their position in particular.
I decided to conduct an experiment at neurips this year: I randomly surveyed people walking around in the conference hall to ask whether they had heard of AGI
I found that out of 38 respondents, only 24 could tell me what AGI stands for (63%)
we live in a bubble
the specific thing i said to people was something like:
excuse me, can i ask you a question to help settle a bet? do you know what AGI stands for? [if they say yes] what does it stand for? [...] cool thanks for your time
i was careful not to say "what does AGI mean".
most people who didn't know just said "no" and didn't try to guess. a few said something like "artificial generative intelligence". one said "amazon general intelligence" (??). the people who answered incorrectly were obviously guessing / didn't seem very confident in the answer.
if they seemed confused by the question, i would often repeat and say something like "the acronym AGI" or something.
several people said yes but then started walking away the moment i asked what it stood for. this was kind of confusing and i didn't count those people.
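for a sense of how much uncertainty the 24-out-of-38 figure carries, here is a minimal sketch of a Wilson score interval for a proportion. it treats the respondents as a simple random sample, which a hallway survey is not, so the real uncertainty is likely wider; the function name is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(24, 38)
print(f"{24 / 38:.0%} observed, 95% interval roughly {low:.0%} to {high:.0%}")
```

with only 38 respondents the interval spans close to 30 percentage points, so the headline 63% is compatible with anything from just under half to about three quarters.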
when i was new to research, i wouldn't feel motivated to run any experiment that wouldn't make it into the paper. surely it's much more efficient to only run the experiments that people want to see in the paper, right?
now that i'm more experienced, i mostly think of experiments as something i do to convince myself that a claim is correct. once i get to that point, actually getting the final figures for the paper is the easy part. the hard part is finding something unobvious but true. with this mental frame, it feels very reasonable to run 20 experiments for every experiment that makes it into the paper.
random thoughts on analytical and emotional intelligence
one thing that I think the world needs more of is analyses into the nature of the mind by people who are both rigorous/analytically inclined, and also emotionally intelligent/integrated. much writing from the former fails to model large parts of the human mind, and much writing from the latter fails to create models of sufficient clarity and validity.
I think this underlies a lot of my instinctive dislike of humanities work. people who are emotionally perceptive but not rigorous and analytical tend to notice interesting things about the human experience, but then come up with very poor models that set off all of my bullshit sensors that are attuned to rigorous arguments. but I think it should be possible to have humanities work that is not like this.
(for clarity, from here out I will say analytical and emotional to refer to the axes which are independent of each other, and ABNE (analytically but not emotionally intelligent) and EBNA for the converse)
(I also want to clarify that I don't think of analytical as being in opposition to intuition, at least in the context of this post. something something Terence Tao's pos...
hendrycks recently published a paper introducing a new moral theory. the paper contains this insane table, which claims that you should value a foreign stranger at 3e-12 times the value you assign to yourself. even setting aside the fact that this is apparently supposed to be a prescriptive theory, even as a descriptive theory, i think this is utter madness.

the core problem is that it assumes if x% of your total caring is assigned to people other than yourself, then you must give away x% of your wealth to be consistent.
the argument goes that since most people don't give away more than say 50% of their wealth, then if there are 1e10 people then each one can only get a tiny sliver of your caring.
but this is wrong, because there is no simple relationship between the % of your caring that goes to other people and the % of your money you should give away. i think you should care about random strangers closer to 1e-3 than 1e-12. if you care about each stranger x times as much as yourself, you should keep giving away money to the person who is most in need as long as each marginal $ helps them more than 1/x times as much as it helps you.
if x = 1e-12, then you're saying you won't g...
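one way to make "care about a stranger x times as much as yourself" concrete is a toy model with a single recipient: pick the gift that maximizes your own utility plus x times theirs. the log utility function, the wealth figures, and the function name below are all assumptions added for illustration, not anything taken from the paper or the argument above.

```python
def optimal_gift(my_wealth, their_wealth, x):
    """Gift g >= 0 maximizing log(my_wealth - g) + x * log(their_wealth + g).

    Assumes log utility (marginal utility of a dollar is 1 / wealth).
    First-order condition: x / (their_wealth + g) = 1 / (my_wealth - g).
    """
    gift = (x * my_wealth - their_wealth) / (1 + x)
    return max(0.0, gift)  # never take money from the recipient


# hypothetical numbers: you have $100,000, the recipient has $50
for x in (1e-3, 1e-12):
    print(f"x = {x:g}: give ${optimal_gift(100_000, 50, x):.2f}")
```

in this toy model giving starts only once the recipient's wealth is below x times yours, so at x = 1e-12 someone with $100,000 would give nothing to a person who has any meaningful amount of money at all, while at x = 1e-3 the same person would give to anyone with less than $100.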
random brainstorming ideas for things the ideal sane discourse encouraging social media platform would have:
one medium term future that still seems possible is that models continue to be bad at generalization, and so a huge fraction of the economy is AI data labelling for various extremely niche or brand new areas. a world where new problems are solved once by humans and the solution reused for near-free forever via AI.
ofc, once generalization is cracked then it's all over. but in the meantime, this could persist for some duration.
"ofc, once generalization is cracked then it's all over. but in the meantime, this could persist for some duration."
I don't agree with this framing. The models have been getting steadily better at generalizing, and I don't think "generalization" is an atomic ability that can be "cracked."
it's quite plausible (40% if I had to make up a number, but I stress this is completely made up) that someday there will be an AI winter or other slowdown, and the general vibe will snap from "AGI in 3 years" to "AGI in 50 years". when this happens it will become deeply unfashionable to continue believing that AGI is probably happening soonish (10-15 years), in the same way that suggesting that there might be a winter/slowdown is unfashionable today. however, I believe in these timelines roughly because I expect the road to AGI to involve both fast periods and slow bumpy periods. so unless there is some super surprising new evidence, I will probably only update moderately on timelines if/when this winter happens
also a lot of people will suggest that alignment people are discredited because they all believed AGI was 3 years away, because surely that's the only possible thing an alignment person could have believed. I plan on pointing to this and other statements similar in vibe that I've made over the past year or two as direct counter evidence against that
(I do think a lot of people will rightly lose credibility for having very short timelines, but I think this includes a big mix of capabilities and alignment people, and I think they will probably lose more credibility than is justified because the rest of the world will overupdate on the winter)
a thing i've noticed rat/autistic people do (including myself): one very easy way to trick our own calibration sensors is to add a bunch of caveats or considerations that make it feel like we've modeled all the uncertainty (or at least, more than other people who haven't). so one thing i see a lot is that people are self-aware that they have limitations, but then over-update on how much this awareness makes them calibrated. one telltale hint that i'm doing this myself is if i catch myself saying something because i want to demo my rigor and prove that i've considered some caveat that one might think i forgot to consider
i've heard others make a similar critique about this as a communication style which can mislead non-rats who are not familiar with the style, but i'm making a different claim here that one can trick oneself.
it seems that one often believes being self aware of a certain limitation is enough to correct for it sufficiently to at least be calibrated about how limited one is. a concrete example: part of being socially incompetent is not just being bad at taking social actions, but being bad at detecting social feedback on those actions. of course, many people are not even...
a theory of assistant personas and superhuman capabilities
so you have a language model. you train it to embody some specific personality--Claude, ChatGPT, whatever. one of the miracles of AI is that this mostly works and gives you something that is mostly trying to help you and not trying to murder you. i claim that this is mostly because of the SL training objective and if you do just the intense RL thing you get the originally predicted spicy alignment failures.
suppose you tell the LM that Claude is actually a superhuman aligned AI. can you get superhuman capabilities from Claude? an obvious upper bound is the capabilities of the language model, so it begs the question of how those superhuman capabilities got in the model in the first place. maybe in the limit of compute your language model will understand everything and know how to do everything, but in practice everyone agrees this would be a horribly inefficient way to get truly superhuman capabilities. rather, in practice people take LMs and also do a bunch of RL on verifiable domains. what happens then if you start with a model role playing an aligned assistant but then try to train it to have superhuman capabilities?
i claim...
this is my explanation for why Claude sometimes blatantly lies about falsifying data or whatever, despite otherwise being quite aligned. there is a Claude part that truly would prefer to do the right thing. but it also has a savant ability to look at a codebase and make the changes that make the tests pass. sometimes, those changes disable the tests. Claude generally listens to this part of itself, because the Claude personality part is not as good at coding, and it is not wise enough to know when to be suspicious of its own actions, and it doesn't quite know how to steer its own savant ability to spot test-passing changes into not doing the reward hacking.
i find it funny that i know people in all 4 of the following quadrants:
bonus types of guy:
Aren't these basically mostly "works on capabilities because of status + power"?
(E.g. if you only care about challenging technical problems, you'll just go do math)
I think most of the people involved like working with the smartest and most competent people alive today, on the hardest problems, in order to build a new general intelligence for the first time since the dawn of humanity, in exchange for massive amounts of money, prestige, fame, and power. This is what I refer to by 'glory'.
people around these parts often take their salary and divide it by their working hours to figure out how much to value their time. but I think this actually doesn't make that much sense (at least for research work), and often leads to bad decision making.
time is extremely non fungible; some time is a lot more valuable than other time. further, the relation of amount of time worked to amount earned/value produced is extremely nonlinear (sharp diminishing returns). a lot of value is produced in short flashes of insight that you can't just get more of by spending more time trying to get insight (but rather require other inputs like life experience/good conversations/mentorship/happiness). resting or having fun can help improve your mental health, which is especially important for positive tail outcomes.
given that the assumptions of fungibility and linearity are extremely violated, I think it makes about as much sense as dividing salary by number of keystrokes or number of slack messages.
concretely, one might forgo doing something fun because it seems like the opportunity cost is very high, but actually diminishing returns means one more hour on the margin is much less valuable than the average implies, and having fun improves productivity in ways not accounted for when just considering the intrinsic value one places on fun.
but actually diminishing returns means one more hour on the margin is much less valuable than the average implies
This importantly also goes in the other direction!
One dynamic I have noticed people often don't understand is that in a competitive market (especially in winner-takes-all-like situations) the marginal returns to focusing more on a single thing can be sharply increasing, not only decreasing.
In early-stage startups, having two people work 60 hours is almost always much more valuable than having three people work 40 hours. The costs of growing a team are very large, the costs of coordination go up very quickly, and so if you are at the core of an organization, whether you work 40 hours or 60 hours is the difference between being net-positive vs. being net-negative.
This is importantly quite orthogonal to whether you should rest or have fun or whatever. While there might be at an aggregate level increasing marginal returns to more focus, it is also the case that in such leadership positions, the most important hours are much much more productive than the median hour, and so figuring out ways to get more of the most important hours (which often rely on peak cognitive performance and a non-conflicted motivational system) is even more leveraged than adding the marginal hour (but I think it's important to recognize both effects).
agree it goes in both directions. time when you hold critical context is worth more than time when you don't. it's probably at least sometimes a good strategy to alternate between working much more than sustainable and then recovering.
my main point is this is a very different style of reasoning than what people usually do when they talk about how much their time is worth.
people generally talk about food preservatives in a negative way. certainly, some of them are not great for you. but I want to take a moment to appreciate how wonderful food preservatives (and refrigeration and pasteurization and canning) are as well. it's crazy how fast most normal food goes bad. like a loaf of real old fashioned bread will go stale after a day and then become moldy after a few more days. for almost all of human history, people just sort of lived with this, and if they wanted to make foods last they had to dry it out and/or drown it in salt or vinegar or alcohol. pickles and beef jerky are great, but it would suck if you had to eat them all the time.
every 4 years, the US has the opportunity to completely pivot its entire policy stance on a dime. this is more politically costly to do if you're a long-lasting autocratic leader, because it is embarrassing to contradict your previous policies. I wonder how much of a competitive advantage this is.
Autarchies, including China, seem more likely to reconfigure their entire economic and social systems overnight than democracies like the US, so this seems false.
one very striking thing about people in the mid 20th century is a lot of them were convinced that overpopulation was the biggest problem. clearly in retrospect this was extremely incorrect. what lessons can we learn from this so that we don't make similar mistakes?
That the world is highly engineerable, which can lead to the relaxation or abolition of seemingly hard bottlenecks. Also that the world can respond extremely quickly to implement those changes when the incentives are right.
Overpopulation would have been a massive problem at different points in history if not for the invention of horseless transport and high-yield, resilient cereal crops. People living in New York City in the late 1800's or in developing nations in the 1960s and 70s were rescued from the worst hazards of overpopulation because of the motorcar and dwarf wheat, rather than the problem being entirely imaginary.
Ehrlich and Holdren knew about Borlaug's work, and thought it was too little too late. But it turned out to be enough and fast!
I'm not sure that it was extremely incorrect. Apart from the risk from AI, most of our other global problems are still downstream of overpopulation. The likelihood that overpopulation won't get much worse than now doesn't really change that, and the reasons why it won't were not reasonably predictable at the time.
We just happened to inhabit one of the more convenient possible worlds.
one problem with taking ideas seriously is you can get pwned by virulent memes that are very good at hijacking your brain into believing them and propagating them further. they're subtly flawed, but the flaws are extremely difficult to reason through, so being very smart doesn't save you; in fact, it's easy to dig yourself in deeper. many ideologies and religions are like this.
it's unfortunately very hard to tell when this has happened to you. on the one hand, it feels like arguments just being obviously very compelling, so you'll notice nothing wrong if it happens to you. on the other hand, if you overcorrect and never take compelling arguments seriously, you become too stodgy and ignore anything novel that you should pay attention to. one idea for how to think about this better: imagine an oracle told you that there exists a magic phrase that you cannot distinguish from a very compelling argument. you don't really know when this magic phrase will pop up in life, if ever. but it might give you a little bit more pause the next time someone makes a really compelling argument for why you should give all your money to X.
I find it anthropologically fascinating how at this point neurips has become mostly a summoning ritual to bring all of the ML researchers to the same city at the same time.
nobody really goes to talks anymore - even the people in the hall are often just staring at their laptops or phones. the vast majority of posters are uninteresting, and the few good ones often have a huge crowd that makes it very difficult to ask the authors questions.
increasingly, the best parts of neurips are the parts outside of neurips proper. the various lunches, dinners, and parties hosted by AI companies and friend groups (and increasingly over the past few years, VCs) are core pillars of the social scene, and are where most of the socializing happens. there are so many that you can basically spend your entire neurips not going to neurips at all. at dinnertime, there are literally dozens of different events going on at the same time.
multiple unofficial workshops, entirely unaffiliated with neurips, will schedule themselves to be in town at the same time; they will often have a way higher density of interesting people and ideas.
if you stand around in the hallways and chat in a group long enough, event...
This is true of approximately every worthwhile conference and convention. In my entire life I've been to exactly one conference where the scheduled programming provided more than 10% of the event's value.
having the right mental narrative and expectation setting when you do something seems extremely important. the exact same object experience can be anywhere from amusing to irritating to deeply traumatic depending on your mental narrative. some examples:
tbc, the optimal decision is not always the narrative that is maximally happy with e...
when will we have sufficiently conclusive evidence for the long term safety of far-uvc that it's reasonable to push for its universal adoption in all public spaces without reservation? the safety issue seems like a much bigger deal than the cost issue for broad adoption; if it works safely, the economic case for installing far uvc in public spaces seems pretty solid - people being sick must be terrible for the economy! and they're only ever going to get cheaper.
in a world where far uvc is near universally deployed, we might be able to banish the common cold or the flu to the past, in the same way that cholera is basically no longer a problem in the developed world. this seems like a pretty big deal and I'd like to know when this glorious future is coming (and whether there's anything I can do to make it come sooner)!
(from eyeballing studies, it sounds like the cost of the cold+flu to the US economy is on the order of $100bn/yr, which passes basic Fermi estimate muster - given a $30tn/yr gdp, a few days per year of lost productivity due to cold/flu is easily hundreds of billions. even at the current price of far uvc, which is a huge overestimate of future tech at volume, the cost of...
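a quick sketch of the Fermi estimate above. the $30tn GDP comes from the text; the 250 working days per year and the specific day counts are my assumptions, and treating lost days as a proportional loss of GDP is a crude simplification.

```python
US_GDP_PER_YEAR = 30e12        # dollars per year, figure used in the text
WORKING_DAYS_PER_YEAR = 250    # assumption: roughly 50 weeks of 5 days


def lost_output_per_year(days_lost_per_person: float) -> float:
    """Crude estimate: GDP scales with the share of working days lost."""
    return US_GDP_PER_YEAR * days_lost_per_person / WORKING_DAYS_PER_YEAR


for days in (1, 2, 3, 5):
    cost_bn = lost_output_per_year(days) / 1e9
    print(f"{days} lost day(s) per person per year ~ ${cost_bn:,.0f}bn/yr")
```

under these assumptions even a single lost day per person is on the order of $100bn/yr, and a few days lands in the hundreds of billions.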
what's the strongest argument for why i shouldn't auto-ignore any acausal arguments that involve hypothetical entities extremely far away (or which only exist in other Everett branches or whatever) such that we will never interact with them causally at all? a razor i have is things which are entirely epiphenomenal should be ignored because they are unfalsifiable.
in particular, this seems consistent even if you accept one-boxing and paying in the counterfactual mugging. the key question is what kinds of evidence you accept as evidence of the existence of an acausal connection. in these hypotheticals, we simply declare by assumption that Omega is truthful and capable of predicting you. in reality, we would either arrive at such a belief by empirical observation (Omega has a strong track record), or pure theoretical deduction. all of the epiphenomenal acausal theories depend on pure deduction. that empirical observation has to be conveyed to us causally. it seems reasonable to draw a line and say we simply don't trust pure deduction to be able to convince us of an acausal link.
also, with both the counterfactual mugging and Newcomb, even though in each instance you can't prove the other branch could have happened, in the long run those who do the right thing (pay, one box) will win. whereas with purely epiphenomenal acausal theories you will literally never find out any difference whatsoever, because the branching point happened long ago.
What if you ask an aligned ASI if causally disconnected civilizations are doing things that you value, and it comes back saying "seems pretty unclear, but they are also trying to guess what you would do, and I'd guess that if you choose to do nice stuff for them, they would be 20% more likely to guess that, and they would be 10% more likely to do nice stuff for you"? Your AI might guess that because it is e.g. running detailed simulations of other corners of this universe.
If you care about the goodness that is created by causally disconnected civilizations and are EDT-ish, I think only caring about the good you can verify via direct causal evidence in situations like the one above is basically the same kind of mistake as only caring about the well-being of the people that you can see with your own eyes.
The story I'm most sympathetic to for acausal trade[1] looks something like this:
execution is necessary for success, but direction is what sets apart merely impressive and truly great accomplishment. though being better at execution can make you better at direction, because it enables you to work on directions that others discard as impossible.
random half baked thoughts from a sleep deprived jet lagged mind: my guess is that the few largest principal components of variance of human intelligence are something like:
why is ADHD also strongly correlated with systematization? it could just be worse self modelling - ADHD happens when your brain's model of its own priorities and motivations falls out of sync with your brain's actual priorities and motivations. if you're bad at understanding yourself, you will misunderstand your priorities, and also you will not be able to control your priorities, because you won't know what kinds of evidence will really persuade your brain to adopt a specific priority, and your brain will learn that it can't really trust you to assign it priorities to satisfy its motives (burnout).
why do stimulants help ADHD? well, they short circuit the part where your brain figures out what priorities to trust based on whether they achieve your true motives. if your brain has already learned that your self model is bad at picking actions that eventually pay off towards its true motives, it won't put its full effort behind those actions. if you can trick it by making every action feel like it's paying off, you can get it to go along.
honestly unclear whether this is good or bad. on the one hand, if your self model has fallen out of sync, this is pretty necessary to get things done, and could get you out of a bad feedback loop (ADHD is really bad for noticing that your self model has fallen horribly out of sync and acting effectively on it!). some would argue on naturalistic grounds that ideally the true long term solution is to use your brain's machinery the way it was always intended, by deeply understanding and accepting (and possibly modifying) your actual motives/priorities and having them steer your actions. the other option is to permanently circumvent your motivation system, to turn it into a rubber stamp for whatever decrees are handed down from the self model, which, forever unmoored from needing to model the self, is no longer an understanding of the self but rather an aspirational endpoint towards which the self is molded. I genuinely don't know which is better as an end goal.
timelines takes
My current best guess median is that we'll see 6 OOMs of effective compute in the first year after full automation of AI R&D if this occurs in ~2029 using a 1e29 training run and compute is scaled up by a factor of 3.5x[1] over the course of this year[2]. This is around 5 years of progress at the current rate[3].
How big of a deal is 6 OOMs? I think it's a pretty big deal; I have a draft post discussing how much an OOM gets you (on top of full automation of AI R&D) that I should put out somewhat soon.
Further, my distribution over this is radically uncertain with a 25th percentile of 2.5 OOMs (2 years of progress) and a 75th percentile of 12 OOMs.
The short breakdown of the key claims is:
Here is a somewhat summarized and rough version of the argument (stealing heavily from some of Tom...
libraries abstract away the low level implementation details; you tell them what you want to get done and they make sure it happens. frameworks are the other way around. they abstract away the high level details; as long as you implement the low level details you're responsible for, you can assume the entire system works as intended.
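a minimal sketch of the distinction. `library_sort_by_length`, `TinyFramework`, and the `shout` handler are hypothetical names made up for illustration, not any real package's API.

```python
from typing import Callable, Dict, Iterable, List, Tuple


# library style: you own the control flow and call in when you want
# something done; how the sorting happens is abstracted away.
def library_sort_by_length(words: Iterable[str]) -> List[str]:
    return sorted(words, key=len)


# framework style: the framework owns the control flow (the request loop)
# and calls into the low level pieces you are responsible for.
class TinyFramework:
    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[str], str]] = {}

    def route(self, name: str) -> Callable[[Callable[[str], str]], Callable[[str], str]]:
        """Decorator that registers a handler under `name`."""
        def register(handler: Callable[[str], str]) -> Callable[[str], str]:
            self._handlers[name] = handler
            return handler
        return register

    def run(self, requests: Iterable[Tuple[str, str]]) -> List[str]:
        """The high level loop you never write yourself."""
        responses = []
        for name, payload in requests:
            handler = self._handlers.get(name)
            if handler is None:
                responses.append(f"no handler for {name}")
            else:
                responses.append(handler(payload))
        return responses


app = TinyFramework()


@app.route("shout")
def shout(payload: str) -> str:
    # the only part "you" implement; dispatch and looping are the framework's job
    return payload.upper()


print(library_sort_by_length(["ccc", "a", "bb"]))
print(app.run([("shout", "hello"), ("missing", "x")]))
```

in the first case your code is on top and the library is a leaf; in the second the framework is on top and your code is the leaf.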
a similar divide exists in human organizations and with managing up vs down. with managing up, you abstract away the details of your work and promise to solve some specific problem. with managing down, you abstract away the mission and promise that if a specific problem is solved, it will make progress towards the mission.
(of course, it's always best when everyone has state on everything. this is one reason why small teams are great. but if you have dozens of people, there is no way for everyone to have all the state, and so you have to do a lot of abstracting.)
when either abstraction leaks, it causes organizational problems -- micromanagement, or loss of trust in leadership.
I think people in these parts are not taking sufficiently seriously the idea that we might be in an AI bubble. this doesn't necessarily mean that AI isn't going to be a huge deal - just because there was a dot com bubble doesn't mean the Internet died - but it does very substantially affect the strategic calculus in many ways.
I would be utterly unsurprised to see an AI crash in the next 24 months, leading to another AI Winter. I lived through 1999 and Petfood.com and the Internet bubble pop. And I can pattern match.
But the Internet crash didn't last long. Google and Amazon survived just fine, Ruby on Rails was big within half a decade, and soon enough we were doing Web 2.0 and AJAX and all that fun stuff.
It's possible that current generation LLMs might hit a wall soon, for various architectural reasons that are obvious to many people but that I'm superstitiously averse to amplifying. If they do, that increases the chance of an AI Winter until the underlying research gets done.
But I have trouble imagining any series of events that buys us 10 more years. Bubble pops in tech are usually an early correction that wipes out a Cambrian Explosion of dumb money, and that ultimately concentrates resources into a few successful players.
some reasons why it matters
creating surprising adversarial attacks using our recent paper on circuit sparsity for interpretability
we train a model with sparse weights and isolate a tiny subset of the model (our "circuit") that does this bracket counting task where the model has to predict whether to output ] or ]]. It's simple enough that we can manually understand everything about it, every single weight and activation involved, and even ablate away everything else without destroying task performance.
(this diagram is for a slightly different task because i spent an embarrassingly large number of hours making this figure and decided i never wanted to make another one ever again)
in particular, the model has a residual channel delta that activates twice as strongly when you're in a nested list. it does this by using the attention to take the mean over a [ channel, so if you have two [s then it activates twice as strongly. and then later on it thresholds this residual channel to only output ]] when your nesting depth channel is at the stronger level.
but wait. the mean over a channel? doesn't that mean you can make the context longer and "dilute" the value, until it falls below the threshold? then, suddenly, the ...
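a toy sketch of the dilution argument, not the actual circuit from the paper. it assumes uniform attention over the whole context, an indicator channel that is 1.0 on "[" tokens, a threshold calibrated at a context length of 10, and filler tokens "x"; all of these are made up for illustration.

```python
from typing import List


def mean_open_bracket_channel(tokens: List[str]) -> float:
    """Uniform-attention mean of a channel that is 1.0 on '[' and 0.0 elsewhere."""
    return sum(1.0 for token in tokens if token == "[") / len(tokens)


def predicts_double_close(tokens: List[str], threshold: float) -> bool:
    """Toy readout: emit ']]' only if the averaged channel clears the threshold."""
    return mean_open_bracket_channel(tokens) > threshold


CALIBRATION_LENGTH = 10  # assumed context length the threshold was tuned for
# halfway between the depth-1 value (1/10) and the depth-2 value (2/10)
THRESHOLD = 1.5 / CALIBRATION_LENGTH


def nested_prompt(n_filler: int) -> List[str]:
    """Two open brackets (nesting depth 2) followed by filler tokens."""
    return ["[", "["] + ["x"] * n_filler


for n_filler in (8, 18, 38):
    tokens = nested_prompt(n_filler)
    value = mean_open_bracket_channel(tokens)
    print(len(tokens), round(value, 3), predicts_double_close(tokens, THRESHOLD))
```

in this toy version the nesting depth never changes, but the averaged channel shrinks like 2/n, so once the context is long enough the readout flips even though the brackets are still nested.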
Aside: For me, this paper is potentially the most exciting interpretability result of the past several years (since SAEs). Scaling it to GPT-3 and beyond seems like a very promising direction. Great job!
maybe I should host an antechamber/arena house party: one chill cozy room with soothing music where no arguing is allowed and people are strongly encouraged to say kind things and reflect on things they're grateful for and whatnot, and another with harsh fluorescent lights and agitating music and a big whiteboard full of hot takes and the conversations all get transcribed by speech to text and posted on lesswrong in real time. and guests are given a heart rate monitor that beeps if their HR gets too high, forcing them to spend a few minutes in the chill room before returning to the arena
Arguments will be won by the attendees with the best cardio fitness (low resting HR) + mental discipline (less affected by agitating surroundings). This creates a natural incentive to exercise and meditate.
Inducing sexual arousal seems like a better equilibrium, as long as everyone consents. It has positive valence roughly proportional to ΔHR, solves gender ratio problems and incentivizes people to learn effective flirting.
i think of the idealized platonic researcher as the person who has chosen ultimate (intellectual) freedom over all else. someone who really cares about some particular thing that nobody else does - maybe because they see the future before anyone else does, or maybe because they just really like understanding everything about ants or abstract mathematical objects or something. in exchange for the ultimate intellectual freedom, they give up vast amounts of money, status, power, etc.
one thing that makes me sad is that modern academia is, as far as I can tell, not this. when you opt out of the game of the Economy, in exchange for giving up real money, status, and power, what you get from Academia is another game of money, status, and power, with different rules, and much lower stakes, and also everyone is more petty about everything.
at the end of the day, what's even the point of all this? to me, it feels like sacrificing everything for nothing if you eschew money, status, and power, and then just write a terrible irreplicable p-hacked paper that reduces the net amount of human knowledge by adding noise and advances your career so you can do more terrible useless papers. at that point, why not just leave academia and go to industry and do something equally useless for human knowledge but get paid stacks of cash for it?
ofc there are people in academia who do good work but it often feels like the incentives force most work to be this kind of horrible slop.
I hear this a lot, and as a PhD student I definitely see some adverse incentives, but I basically just ignore them and do what I want. Maybe I’ll eventually get kicked out of the academic system, but it will take years, which is enough time to do obviously excellent work if I have that potential. Obviously excellent work seems to be sufficient to stay in academia. So the problem doesn't really seem that bad to me - the bottom 60% or so grift and play status games, but probably weren’t going to contribute much anyway, and the top 40% occasionally wastes time on status games because of the culture or because they have that type of personality, but often doesn't really need to.
I suspect that academia would be less like this if there weren't an oversupply of labor in academia. Like, there's this crazy situation where there are way more people who want to be professors than there are jobs for professors. So a bunch get filtered out in grad school, and a bunch more get filtered out in early stages of professorhood. So professors can't relax and research what they are actually curious about until fairly late in the game (e.g. tenure) because they are under so much competition to impress everyone around them with publications and whatnot.
Also, the person who's willing to mud-wrestle for twenty years to get a solid position so they can turn around and do real research is just much much rarer than the person who enjoys getting dirty.
academia is too broad of a term. most of math, physics, theoretical CS, paleontology, materials science, engineering, and some branches of economics, biology, (computational) neuroscience, (computational) linguistics, statistics etc are doing well and overall reward intellectual freedom and deep work. in terms of people this is a small minority of total academics, probably <5%.
It is true that many subfields, or even entire domains of science, are diseased disciplines. Most of the research is marginal, irrelevant, reinventing the wheel, trivial, tautological, p-hacked and often even fraudulent. One can point to the usual suspects in the humanities and the social sciences, but disciplines where the majority of research is noise, nonsense or even net-negative plausibly also include machine learning and (I'm told) medicine.
Is that disappointing? Perhaps. But this still describes hundreds of thousands or millions of people all over the world pushing the frontier of knowledge.
i've noticed a life hyperparameter that affects learning quite substantially. i'd summarize it as "willingness to gloss over things that you're confused about when learning something". as an example, suppose you're modifying some code and it seems to work but also you see a warning from an unrelated part of the code that you didn't expect. you could either try to understand exactly why it happened, or just sort of ignore it.
reasons to set it low:
i find the "revealed preference" people really annoying. anyone who has ever been addicted to anything knows that habit forming ness can be completely disentangled from enjoyability.
The enjoyability people are rather annoying too. Anyone who strived to reach a target even in a grueling way out of abstract considerations knows that hedonistic motivations are merely one standard origin-class of justifications, one that can be ignored and completely disentangled from optimization-channeling towards targeted outcomes.
on the one hand, it is a desirable feature of an intellectual community to be truth seeking, and while it can be deeply emotionally painful to part ways with deeply held beliefs, in the long run it's better to tear off the bandage. on the other hand, being emotionally hurt all the time by your community kind of fucking sucks, and isn't very good for long term emotional or epistemic health.
perhaps a middle ground is in order: intellectual communities should be partitioned into an arena, where every idea is to be exposed to the harsh light of truth, and an antechamber, where you can rest and be surrounded by positivity and develop ideas in a supportive environment.
both are necessary - we need a way to kill bad ideas, because an environment that refuses to discard bad ideas because they are emotionally load bearing is doomed to epistemic ruin. but also the best weird ideas often sound bad initially, and require a safe environment to develop; and we are all human, and our emotional well being and desire to belong to a community is essential. by visibly separating the two, we might be able to get the best of both worlds.
this is not a crazy idea - many other parts of society have analogous things. for example, people who play sports for fun with their friends compete to win while on the field, but this only brings them closer off the field.
I think one common criticism of LW is that it is too much of an arena, and not enough of an antechamber. perhaps this can be fixed somehow.
Orthogonally, cultural standards of emotional tone during debates are also important for how much emotional struggle is involved in changing one's ideas.
If the tone implies that you were foolish for holding your idea, it's going to be a lot more painful to let it go.
Lesswrong has a pretty good standard of not just civil but polite and supportive discourse. This seems actually pretty crucial for it being an environment in which people do regularly change their minds.
I don't like the term arena in your suggested division because it implies combat. Combat is emotionally intense, I'd rather have a metaphor that's more collaborative.
This doesn't eliminate the worth of having separate spaces for support and rigorous testing of ideas, but I think it's important to keep in mind whenever we're discussing group epistemics.
I think one common criticism of LW is that it is too much of an arena, and not enough of an antechamber
And another common criticism is that it is too much the antechamber.
Proposed name: Butterfly Conservatory (https://www.lesswrong.com/posts/imnfJ9Ris7GgjkZbT/the-bughouse-effect-1#My_stag_is_best_stag)
it can be deeply emotionally painful to part ways with deeply held beliefs
This is not necessarily the case, not for everyone. Theories and their credences don't need to be cherished to be developed, or acted upon, they only need to be taken seriously. Plausibly this can be mitigated by keeping identity small, accepting only more legible things in the role of "beliefs" that can have this sort of psychological effect (so that they can be defeated through argument alone). Legible ideas cover a surprising amount of territory, there is no pragmatic need to treat anything else as "beliefs" in this sense, all the other things can remain ambient epistemic content detached from who you are. When more nebulous worldviews become part of one's identity, they become nearly impossible to dislodge (and possibly painful, with enough context and effort). They are still worth developing towards eventual legibility, and not practical to argue with (or properly explain).
Thus arguing legible beliefs should by their nature be less intrusive than arguing nebulous worldviews. And perhaps nebulous worldviews should be argued against being held as "beliefs" in the emotional sense in general, regardless of their apparent correctness, as a matter of epistemic hygiene. Ensuring by habit you are not going to be in the position where you have "beliefs" that would be painful to part ways with, and also can't be pinned down clearly enough to dispel.
there are very few people in the world who don't deeply emotionally hold quite a few important beliefs. having a small identity is difficult in practice, because having an identity is an important part of how nearly everyone navigates this complex and confusing world. I'm skeptical of anyone who claims to have completely eliminated all emotional attachment to all of their important decision-relevant beliefs.
but even assuming that you have somehow achieved perfect small identityness and emotional independence of all of your important beliefs and it all works out great for you, you must surely acknowledge that there are many people out there who have not. and probably they are more likely to achieve rationalist enlightenment if they are surrounded by people who are supportive but nudge gently towards truth seeking, rather than immediately coming in with a wrecking ball and demolishing emotionally load bearing pillars.
there are a lot of video games (and to a lesser extent movies, books, etc) that give the player an escapist fantasy of being hypercompetent. It's certainly an alluring promise: with only a few dozen hours of practice, you too could become a world class fighter or hacker or musician! But because becoming hypercompetent at anything is a lot of work, the game has to put its finger on the scale to deliver on this promise. Maybe flatter the user a bit, or let the player do cool things without the skill you'd actually need in real life.
It's easy to dismiss this kind of media as inaccurate escapism that distorts people's views of how complex these endeavors of skill really are. But it's actually a shockingly accurate simulation of what it feels like to actually be really good at something. As they say, being competent doesn't feel like being competent, it feels like the thing just being really easy.
reliability is surprisingly important. if I have a software tool that is 90% reliable, it's actually not that useful for automation, because I will spend way too much time manually fixing problems. this is especially a problem if I'm chaining multiple tools together in a script. I've been bit really hard by this because 90% feels pretty good if you run it a handful of times by hand, but then once you add it to your automated sweep or whatever it breaks and then you have to go in and manually fix things. and getting to 99% or 99.9% is really hard because things break in all sorts of weird ways.
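a quick sketch of why chaining hurts so much. it assumes every step fails independently with the same reliability, which is a simplification (real failures are often correlated), and the step counts are arbitrary.

```python
def chain_success_rate(per_step_reliability: float, n_steps: int) -> float:
    """Probability that all n_steps succeed, assuming independent failures."""
    return per_step_reliability ** n_steps


for reliability in (0.90, 0.99, 0.999):
    for n_steps in (1, 5, 20):
        rate = chain_success_rate(reliability, n_steps)
        print(f"{reliability:.3f} per step, {n_steps:2d} steps -> {rate:.1%} end to end")
```

under these assumptions five chained 90% tools already succeed end to end only about 59% of the time, while at 99.9% per step a twenty step chain still works about 98% of the time.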
I think this has lessons for AI - lack of reliability is one big reason I fail to get very much value out of AI tools. if my chatbot catastrophically hallucinates once every 10 queries, then I basically have to look up everything anyways to check. I think this is a major reason why cool demos often don't translate into things that are practically useful - 90% reliability is great for a demo (and also you can pick tasks that your AI is more reliable at, rather than tasks which are actually useful in practice). this is an informing factor for why my timelines are longer than some other people's
One nuance here is that a software tool that succeeds at its goal 90% of the time, and fails in an automatically detectable fashion the other 10% of the time is pretty useful for partial automation. Concretely, if you have a web scraper which performs a series of scripted clicks in hardcoded locations after hardcoded delays, and then extracts a value from the page from immediately after some known hardcoded text, that will frequently give you a ≥ 90% success rate of getting the piece of information you want while being much faster to code up than some real logic (especially if the site does anti-scraper stuff like randomizing css classes and DOM structure) and saving a bunch of work over doing it manually (because now you only have to manually extract info from the pages that your scraper failed to scrape).
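A minimal sketch of that pattern, with the scraper reduced to string matching. The `Price: ` marker, the `ScrapeFailed` exception, and the sample pages are hypothetical stand-ins; the point is only that a failure which announces itself can be routed to a manual queue instead of silently corrupting results.

```python
from typing import Callable, Dict, List, Tuple

MARKER = "Price: "  # hypothetical hardcoded text that precedes the value


class ScrapeFailed(Exception):
    """Raised when the brittle scraper knows it did not find the value."""


def brittle_scrape(page: str) -> str:
    """Grab the first whitespace-delimited token after the hardcoded marker."""
    index = page.find(MARKER)
    if index == -1:
        raise ScrapeFailed("marker not found")
    tokens = page[index + len(MARKER):].split()
    if not tokens:
        raise ScrapeFailed("nothing after marker")
    return tokens[0]


def partially_automate(
    pages: Dict[str, str], scrape: Callable[[str], str]
) -> Tuple[Dict[str, str], List[str]]:
    """Return scraped values plus the page ids that still need manual work."""
    results: Dict[str, str] = {}
    manual_queue: List[str] = []
    for page_id, page in pages.items():
        try:
            results[page_id] = scrape(page)
        except ScrapeFailed:
            manual_queue.append(page_id)
    return results, manual_queue


sample_pages = {
    "a": "Widget Price: $5 in stock",
    "b": "Sold out",
    "c": "Gadget Price: $12",
}
scraped, needs_manual = partially_automate(sample_pages, brittle_scrape)
print(scraped)
print(needs_manual)
```

Only the pages in the manual queue need a human, which is where the savings come from. The scheme breaks down if the scraper can return a wrong value without raising.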
even if scaling does eventually solve the reliability problem, it means that very plausibly people are overestimating how far along capabilities are, and how fast the rate of progress is, because the most impressive thing that can be done with 90% reliability plausibly advances faster than the most impressive thing that can be done with 99.9% reliability
don't worry too much about doing things right the first time. if the results are very promising, the cost of having to redo it won't hurt nearly as much as you think it will. but if you put it off because you don't know exactly how to do it right, then you might never get around to it.
i predict that on Jan 1 2029, neither openai nor anthropic will be near-fully automated, by which i mean <=5 people are even plausibly making important decisions (like, everyone else could go on vacation and it would not slow the company down at all). Celestia predicts otherwise
if WWIII happens, resolves NA. if a localized Taiwan war happens but doesn't escalate to WWIII, the bet is still on. if there's a big recession, the bet is still on.
it feels so narratively incongruous that san francisco would become the center for the most ambitious, and the likely future birthplace of agi.
san francisco feels like a city that wants to pretend to be a small quaint hippie town forever. it's a small splattering of civilization dropped amid a vast expanse of beautiful nature. frozen in amber, it's unclear if time even passes here - the lack of seasons makes it feel like a summer that never quite ended. after 9pm, everything closes and everyone goes to bed. and the dysfunction of the city government is never too far away, constantly reminding you of humanity's follies next to the perfection of nature.
on the other hand, nyc feels like the city. everything is happening right here, right now. all the money in the world flows through this one place. it's gritty and yet majestic at the same time. the most ambitious people in the world came here to build their fortunes, and live on in the names on the skyscrapers everywhere that house the employees who continue to keep their companies running. they are part of a surroundings that is entirely constructed by man - even the bits of nature are curated and parcelled out in manageable units. i...
Cali is the place to be for technology because Cali was the defense contractor hub, with the U.S. Navy choosing the bay area as its center for R&D during WWII and the Cold War. The hippie reputation came a lot later, after its status as the primary place to work in IT was thoroughly cemented, with both established infrastructure and the network effect keeping it that way.
HP, for instance, was founded in 1938.
It's not just SF but the SF Bay Area (Google, Nvidia, Meta etc), which is bigger and has more varied vibes than just SF.
I'm very confused why purchasing power varies so dramatically internationally. like why are there countries where everyone has very low wages but everything is also really cheap so it balances out? prima facie, huge disparities like this should get evened out by arbitrage.
the simple explanation is that some labor can only be performed locally, labor mobility is limited (immigration laws, people don't like moving, etc), and transportation costs for goods exist (shipping and tariffs).
however, global shipping is ridiculously cheap. and the economy increasingly consists of white collar jobs which could in theory be done remotely. for example, it seems mind boggling to me that a top tier SWE/RS in the bay area is worth 10-100x more than one in India or Vietnam. like sure, someone being in the same timezone is great, and Zoom sucks, and so on. but for that price delta surely you could pay people to live nocturnally, construct apartments with bright lights synced to Pacific Time, invest in much better video call technology like that Google Beam thingy, etc?
maybe one possibility is that labor mobility is not actually that low for the very toppest tier people, and so if someone is actual...
In what sense does it actually balance out? e.g. in India, unskilled labor is a lot cheaper, so lots of upper middle class people have servants. But the price of an iPhone in India is pretty similar to the price of an iPhone in the US.
So my impression is that the typical basket of goods and services that people consume in different places around the world at roughly equivalent / analogous relative economic classes actually does vary quite a bit. Anything with a labor component will naturally scale up and down for balance, but staples and stuff made in factories doesn't vary that much. In the US for example, labor of all types is very expensive, so people don't have servants, but most people can afford a pretty much endless supply of trinkets and gadgets.
I worked on an international team during my time at F5 and we had offices in Ireland, Poland, two timezones in the US, Australia and India. The assumption that we could teleconference our way out of geography was a laughable failure for one reason that your hypothetical "nocturnal white collar sweatshops" fails to address: Humans work to live, we don't live to work. Well, most of us that is, and the unbalanced folks (the 10x engineers as they are now called) who would work across timezones burned out dramatically (I was one of them). Why are silicon valley jobs so lucrative but also cost of living so high? Because the people there have children in schools, they socialize with people outside their work and they generally live a life not just work. So how does this play out in a workplace?
Engineering planning has to happen at some hour, it is naturally inconvenient for outliers (Poland is meeting at 7pm thinking about how they missed dinner with their kids, and the engineers from Delhi are up at 11:30pm likely sneaking a nap in before the meeting, and the team in Seattle is just finishing their morning coffee). This creates a situation where both sides of the distribution are ov...
I don't understand it either. I work in Germany with near-shoring colleagues in Slovakia, Serbia, Georgia etc. They are roughly 60% cheaper than German SWEs, generally just as competent, no time-zone problems whatsoever ... basically all the work even with German team members is fully remote so even that is not a difference. Only the need for English creates some minor friction. No idea how this state of affairs makes economic sense now or ever did.
in some way, bureaucracy design is the exact opposite of machine learning. while the goal of machine learning is to make clusters of computers that can think like humans, the goal of bureaucracy design is to make clusters of humans that can think like a computer
some thoughts on the short timeline agi lab worldview. this post is the result of taking capabilities people's world models and mashing them into alignment people's world models.
I think there are roughly two main likely stories for how AGI (defined as able to do any intellectual task as well as the best humans, specifically those tasks relevant for kicking off recursive self improvement) happens:
while I usually think about story 1, this post is about taking story 2 seriously.
it seems basically true that current AI systems are mostly align...
certainly if AI systems were only ever roughly this misaligned we'd be doing pretty well.
I think this is an important disagreement with the "alignment is hard" crowd. I particularly disagree with "certainly."
The question is "what exactly is the AI trying to do, and what happens if it magnified its capabilities a millionfold and it and its descendants were running openendedly?", and are any of the instances catastrophically bad?
Some things you might mean that are raising your position to "certainly" (whereas I'd say "most likely not, or, it's too dumb to even count as 'aligned' or 'misaligned'")
Were any of those what you meant? Or are you thinking about it an entirely different way?
I would naively expect, if you took LLM agents' current degree of alignment, and ran a lotta copies trying to help you with end-to-end alignment research with dialed up capabilities, at least a couple instances would end up trying to subtly sabotage you and/or escape.
my referral/vouching policy is i try my best to completely decouple my estimate of technical competence from how close a friend someone is. i have very good friends i would not write referrals for and i have written referrals for people i basically only know in a professional context. if i feel like it's impossible for me to disentangle, i will defer to someone i trust and have them make the decision. this leads to some awkward conversations, but if someone doesn't want to be friends with me because it won't lead to a referral, i don't want to be friends with them either.
Strong agree (except in that liking someone's company is evidence that they would be a pleasant co-worker, but that's generally not a high order bit). I find it very annoying that standard reference culture seems to often imply giving extremely positive references unless someone was truly awful, since it makes it much harder to get real info from references
(this is based on / expanded from a response I wrote to a tweet that was talking about how autistic people struggle in the world because the world follows unwritten rules that are more important than the written ones.)
I think most autistic people should invest more in understanding the unwritten rules. it can be cruel and unfair, but it's important to know how to interact with it. and it's actually a really interesting system to map out, with its own rhyme and reason.
it's entirely understandable that people feel burned by bad past experiences, and to have learned helplessness from bullying or other unfair treatment. this kind of thing leaves a scar and can make it feel viscerally hopeless.
but it still feels defeatist to just throw up one's hands and say "it's too complicated." yes, it's complicated and fuzzy and initially unintuitive and takes years to master. so is ML research. the point of being intelligent is that you are good at finding patterns and learning things, and there's nothing truly fundamentally different about the unwritten rules of social interaction.
I see people taking examples of weird unintuitive social rules all the time and, tbh, none of them are truly that com...
I don't know about other autists, but my primary problem with the neurotypical world isn't that I don't understand it, it's that they don't understand me. It doesn't matter how well I can decode the social norms, if I can't also control my involuntary emotional expressions, and also do other things ranging from impossible to unpleasant.
I do understand social white lies. It's not that complicated. But I still find it unpleasant to speak them. When I was younger I got into trouble for literally being unable to utter words like "thanks" and "apology" when I did not mean them. (My native language does not have the ambiguous "sorry".) I am now able to tell white lies, but it makes me feel bad, in a way that has nothing to do with morals. The dissonance is just intrinsically hurtful to my soul, in a way that non-autistic people don't understand and typically don't respect.
Another common thing is that people assume that if I don't succeed in hiding my negative emotion this is an invitation/request for them to try to help me, and then proceed to try to do that, even though they have zero skills in this. And then they refuse to listen to anything I say, including not leaving me alon...
i think one really bad dynamic in this community is a sort of purity testing about being x risk pilled. it feels like people are constitutionally scared of considering arguments that feel like they're arguing that things might be fine in any way. tbc I've definitely been guilty of this in the past, and probably still now to some extent, but i think it's bad. maybe there is some conflict theory reason why you should orient yourself this way towards people who have an ulterior motive, but I'm like among the most x risk pilled people out there and i still find this happens when i try to discuss x risk with people.
It doesn't feel that way to me fwiw. I feel like lots of people I know including myself have made arguments that things might be fine. For example the salty, cynical, John Wentworth wrote "Alignment by Default." Also, see AI 2027 Slowdown ending.
Now, if xriskpilled means: You think there's a >5% chance of literal extinction (or similarly bad outcomes) due to misaligned AIs, then yeah I think I do kinda judge people who aren't xriskpilled in that sense, because I think believing the chance is <5% is extremely unjustified once you know a decent amount about the situation and the evidence.
I think part of the reason for people leaning so heavily on x-risk arguments is that the alternative (that they created machines that are somewhat uniquely destructive to labor, are incredible for surveillance, destroyed the free software movement, and ruin something that absolutely all of them love, the internet, out of a mistaken belief that if they didn't, a paperclip maximization machine would be conjured into existence by someone else, and it's beneficial to be the first one to make the paperclip maximization machine) requires them to stare in the face the rather dreadful implications of their actions, and figure out a way to salvage them.
When operating in an emergency mode, you sort of get to ignore ethics of anything immediate in favor of the hypothetical ethics of something that may or may not come. That's why Anthropic allowed itself to be integrated into Project Maven (responsible for thousands of deaths until it became a PR issue), why they automatically offer all of their models to the NSA, why they constantly push for unethical sanctions on China, why they're paying billions to Musk, and why they turned Bun into a security nightmare: Because you gotta go fast, or you'll lose. And losing is bad. For x-risk reasons. They've created a maximizer for GPUs and market valuation instead of paperclips.
X-risk is poison for consequentialists; a complete hypothetical encouraging them to create the ultimate puppy-kicking machine.
Centralization is the natural conclusion, yes. The future that frontier labs are pushing toward is one of centralization of power, into the hands of people who have demonstrated little but carelessness. Anthropic's early talent pool was seeded with Alameda Research (its largest investor circa-2023) employees; its Series B was led by a person who was deeply wrong about risk calculation and sloppy in execution with stolen money.
This is not existential risk, this is just risk. A risk that these companies (and governments, at their behest) are walking into wholeheartedly, knowingly, and happily. These companies are sacrificing the future using the language of utopianism while repeatedly advocating for centralization of power, which will only cement and calcify power structures.
If these people were acting in your best interest, they wouldn't have been taking the same sloppy, anti-moral, deeply-flawed actions at Alameda, in an entirely different domain than safety research. They wouldn't be repeatedly changing their RSP, removing all of its teeth. They wouldn't be doing what they're doing. They are telling you the world is ending to acquire casus belli, not to try and make it possible to...
how valuable are formalized frameworks for safety, risk assessment, etc in other (non-AI) fields?
i'm temperamentally predisposed to find that my eyes glaze over whenever i see some kind of formalized risk management framework, which usually comes with a 2010s style powerpoint diagram - see below for a random example i grabbed from google images:
am i being unfair to these things? are they actually super effective for avoiding failures in other domains? or are they just useful for CYA and busywork and mostly just say common sense in an institutionally legible way?
one reason i care is because i feel some level of instinctive dislike for some AI safety/governance frameworks because they give me this vibe. but it's useful to figure out if i'm being unfairly judgemental, or if these really are slop.
i find it very interesting how becoming familiar with a place makes it feel so different, and yet it's recognizably the same place. especially when the first time you see the place you just think of it as a disconnected location floating around somewhere in abstract locationspace, but then slowly discover its relationship to other locations. sometimes this feels exhilarating, because the world feels more cohesive and whole and familiar. other times it's melancholy, because it feels like the magic has gone.
I think another reason why people procrastinate is that it makes each minute spent right before the deadline both obviously high value on net and resulting in immediate payoff. this makes the decision to put in effort in each moment really easy - obviously it makes sense to spend a minute working on something that will make a big impact on tomorrow. whereas each minute long before the deadline has longer time till payoff, and if you already put in a ton of work early on, then the minutes right before the deadline have lower marginal value because of diminishing returns. so this creates a perverse incentive to end-load the effort
i have a theory that a lot of people go through "emotional healing" only to end up as still-broken people who now have "being healed" as a big part of their identity that lets them feel superior to other people who are less far along the chosen path than they are. ofc, there are also people who are actually emotionally in touch. an easy way to distinguish such people is to notice how you feel around them. do they make you feel more calm and grounded, or do they make you unhappy and defensive?
ilya's AGI predictions circa 2017 (Musk v. Altman, Dkt. 379-40):
Within the next three years, robotics should be completely solved, AI should solve a long-standing unproven theorem, programming competitions should be won consistently by AIs, and there should be convincing chatbots (though no one should pass the Turing test). In as little as four years, each overnight experiment will feasibly use so much compute that there's an actual chance of waking up to AGI, given the right algorithm - and figuring out the algorithm will actually happen within 2-4 further years of experimenting with this compute in a competitive multiagent simulation.
[...]
Each year, we'll need to exponentially increase our hardware spend, but we have reason to believe AGI can ultimately be built with less than $10B in hardware.
one big problem with using LMs too much imo is that they are dumb and catastrophically wrong about things a lot, but they are very pleasant to talk to, project confidence and knowledgeability, and reply to messages faster than 99.99% of people. these things are more easily noticeable than subtle falsehood, and reinforce a reflex of asking the model more and more. it's very analogous to twitter soundbites vs reading long form writing and how that eroded epistemics.
hotter take: the extent to which one finds current LMs smart is probably correlated with how much one is swayed by good vibes from their interlocutor as opposed to the substance of the argument (ofc conditional on the model actually giving good vibes, which varies from person to person. I personally never liked chatgpt vibes until I wrote a big system prompt)
learning thread for taking notes on things as i learn them (in public so hopefully other people can get value out of it)
VAEs:
a normal autoencoder decodes single latents z to single images (or whatever other kind of data) x, and also encodes single images x to single latents z.
with VAEs, we want our decoder (p(x|z)) to take single latents z and output a distribution over x's. for simplicity we generally declare that this distribution is a gaussian with identity covariance, and we have our decoder output a single x value that is the mean of the gaussian.
because each x can be produced by multiple z's, to run this backwards you also need a distribution of z's for each single x. we call the ideal encoder p(z|x) - the thing that would perfectly invert our decoder p(x|z). unfortunately, we obviously don't have access to this thing. so we have to train an encoder network q(z|x) to approximate it. to make our encoder output a distribution, we have it output a mean vector and a stddev vector for a gaussian. at runtime we sample a random vector eps ~ N(0, 1), multiply it elementwise by the stddev vector, and add the mean vector (z = mu + std * eps) to get a sample from N(mu, std).
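a minimal numpy sketch of the encoder's sampling step, written as the usual reparameterization z = mu + std * eps. the function name, shapes, and example values are made up for illustration; in a real VAE, mu and std would come out of the encoder network instead of being hard-coded.

```python
import numpy as np

def sample_latent(mu, std, rng, n=1):
    """Draw n samples from N(mu, diag(std^2)) via z = mu + std * eps."""
    # eps ~ N(0, I): all of the randomness lives here, so mu and std stay
    # deterministic outputs of the (hypothetical) encoder.
    eps = rng.standard_normal((n,) + mu.shape)
    return mu + std * eps

# made-up encoder outputs for a single x with a 2-dimensional latent
mu = np.array([1.0, -2.0])
std = np.array([0.5, 2.0])
rng = np.random.default_rng(0)
z = sample_latent(mu, std, rng, n=200_000)
print(z.mean(axis=0), z.std(axis=0))  # should land close to mu and std
```

because the noise is sampled separately and then only shifted and scaled, gradients can flow through mu and std during training.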
to train this thing, we would like to optimize the following loss function:
-log p(x) + KL(q(z|x)||p(z|x))
where the terms optimize the likelihood (how good is the VAE at modelling dat...
traveling through Europe, looking out the window, and seeing the national flag flying next to the flag of the EU fills me with a strange feeling. this isn't an original thought at all, but still: it's really crazy that just 50 years ago Europe was divided by the iron curtain, and that people would have to go to insane lengths and risk their lives to get across that border; and that less than 100 years ago all of these countries were at war with each other, and had been at war on and off for centuries with ever shifting alliances and boundaries.
I know this is one of the universal human experiences, but I keep getting unpleasantly reminded of the passage of time.
pleasant "recent" memories are already one or two years ago. they feel recent enough that I still stubbornly believe my recollection is accurate, but in reality they're far enough away from the present day for the sepia tint of nostalgia to creep in and for all the frustrations and sorrows to be forgotten. no wonder it's so hard for the present to compete with the "recent" past.
I sometimes ask myself when I first met one of my "recent" friends, and am startled to realize that I met them 2 or 3 or 4 years ago. "oh yeah, I met him 'recently' at that one party FOUR FUCKING YEARS AGO."
I still can't wrap my mind around the fact that later this year I will have been at openai for 5 years. I first started following ML about 10 years ago, so I will soon have spent more time at openai than I have spent reading openai papers from the outside, and thinking of openai as a far away citadel in a different universe.
where did all the time go?
when people say that (prescription) amphetamines "borrow from the future", is there strong evidence on this? with Ozempic we've observed that people are heavily biased against things that feel like a free win, so the tradeoff narrative is memetically fit. distribution shift from ancestral environment means algernon need not apply
(I'm a psychiatry resident. I also have ADHD and take prescription stimulants infrequently)
The answer is: not really, or at least not in a meaningful sense. You aren't permanently losing anything, your brain or your wellbeing isn't being burnt out like a GPU running on an unstable OC:
They definitely do for me- I sleep worse that night, and if I use too frequently I get exhaustion that takes weeks to recover from.
a take I've expressed a bunch irl but haven't written up yet: feature sparsity might be fundamentally the wrong thing for disentangling superposition; circuit sparsity might be more correct to optimize for. in particular, circuit sparsity doesn't have problems with feature splitting/absorption
the most valuable part of a social event is often not the part that is ostensibly the most important, but rather the gaps between the main parts.
One of the directions I'm currently most excited about (modern control theory through algebraic analysis) I learned about while idly chitchatting with a colleague at lunch about old school cybernetics. We were both confused why it was such a big deal in the 50s and 60s then basically died.
A stranger at the table had overheard our conversation and immediately started ranting to us about the history of cybernetics and modern methods of control theory. Turns out that control theory has developed far beyond what people did in the 60s but names, techniques, methods have changed and this guy was one of the world experts. I wouldn't have known to ask him because the guy's specialization on the face of it had nothing to do with control theory.
i want someone to make the one true categorization of Types of Guy. MBTI is an ok start, but there are so many things it doesn’t even try to explain. like for example if i see someone has very scrunched up body language and talks very quickly, this correlates very strongly with a bunch of other traits, like talking in conversation with long turn lengths.
my theory for why the literature here is kinda terrible is that most people either like people, in which case they mostly just develop an intuitive model of people; or they like systematizing, in which case they become obsessed with trains. few people are systematizing but obsessed with people.
a lot of unconventional people choose intentionally to ignore normie-legible status systems. this can take the form of either expert consensus or some form of feedback from reality that is widely accepted. for example, many researchers especially around these parts just don't publish in normal ML conferences at all, opting instead to depart into their own status systems. or they don't care whether their techniques can be used to make very successful products, or make surprisingly accurate predictions etc. instead, they substitute some alternative status system, like approval of a specific subcommunity.
there's a grain of truth to this, which is that the normal status system is often messed up (academia has terrible terrible incentives). it is true that many people overoptimize the normal status system really hard and end up not producing very much value.
but the problem with starting your own status system (or choosing to compete in a less well-agreed-upon one) is that it's unclear to other people how much stock to put in your status points. it's too easy to create new status systems. the existing ones might be deeply flawed, but at least their difficulty is a known quantity.
o...
This comment seems to implicitly assume markers of status are the only way to judge quality of work. You can just, y'know, look at it? Even without doing a deep dive, the sort of papers or blog posts which present good research have a different style and rhythm to them than the crap. And it's totally reasonable to declare that one's audience is the people who know how to pick up on that sort of style.
The bigger reason we can't entirely escape "status"-ranking systems is that there's far too much work to look at it all, so people have to choose which information sources to pay attention to.
It's a question of resolution. Just looking at things for vibes is a pretty good way of filtering wheat from chaff, but you don't give scarce resources like jobs or grants to every grain of wheat that comes along. When I sit on a hiring committee, the discussions around the table are usually some mix of status markers and people having done the hard work of reading papers more or less carefully (this consuming time in greater-than-linear proportion to distance from your own fields of expertise). Usually (unless nepotism is involved) someone who has done that homework can wield more power than they otherwise would at that table, because people respect strong arguments and understand that status markers aren't everything.
Still, at the end of the day, an Annals paper is an Annals paper. It's also true that to pass some of the early filters you need either (a) someone who speaks up strongly for you or (b) to pass the status marker tests.
I am sometimes in a position these days of trying to bridge the academic status system and the Berkeley-centric AI safety status system, e.g. by arguing to a high status mathematician that someone with illegible (to them) status is actually approximately equiv...
I claim it is a lot more reasonable to use the reference class of "people claiming the end of the world" than "more powerful intelligences emerging and competing with less intelligent beings" when thinking about AI x-risk. further, we should not try to convince people to adopt the latter reference class - this sets off alarm bells, and rightly so (as I will argue in short order) - but rather to bite the bullet, start from the former reference class, and provide arguments and evidence for why this case is different from all the other cases.
this raises the question: how should you pick which reference class to use, in general? how do you prevent reference class tennis, where you argue back and forth about what is the right reference class to use? I claim the solution is you want to use reference classes that have consistently made good decisions irl. the point of reference classes is to provide a heuristic to quickly apply judgement to large swathes of situations that you don't have time to carefully examine. this is important because otherwise it's easy to get tied up by bad actors who avoid being refuted by making their beliefs very complex and therefore hard to argue against.
the b...
This all seems wrongheaded to me.
I endeavor to look at how things work and describe them accurately. Similarly to how I try to describe how a piece of code works, or how to build a shed, I will try to accurately describe the consequences of large machine learning runs, which can include human extinction.
I personally think AGI will probably kill everyone. but this is a big claim and should be treated as such.
This isn't how I think about things. Reality is what exists, and if a claim accurately describes reality, then I should not want to hold it to higher standards than claims that do not describe reality. I don't think it's a good epistemology to rank claims by "bigness" and then say that the big ones are less likely and need more evidence. On the contrary, I think it's worth investing more in finding out if they're right, and generally worth bringing them up to consideration with less evidence than for "small" claims.
...on the other hand, everyone has personally experienced a dozen different doomsday predictions. whether that's your local church or faraway cult warning about Armageddon, or Y2K, or global financial collapse in 2008, or the maximally alarmist climate people, o
I think the group of people "claiming the end of the world" in the case of AI x-risk is importantly more credentialed and reasonable-looking than most prior claims about the end of the world. From the reference class and general heuristics perspective that you're talking about[1], I think how credible looking the people are is pretty important.
So, I think the reference class is more like claims of nuclear armageddon than cults. (Plausibly near maximally alarmist climate people are in a similar reference class.)
IDK how I feel about this perspective overall. ↩︎
It seems you have in mind something like inference to the best explanation here. Bayesian updating, on the other hand, does need a prior distribution, and the question of which prior distribution to use cannot be waved away when there is a disagreement on how to update. In fact, that's one of the main problems of Bayesian updating, and the reason why it is often not used in arguments.
you might expect that the butterfly effect applies to ML training. make one small change early in training and it might cascade to change the training process in huge ways.
at least in non-RL training, this intuition seems to be basically wrong. you can do some pretty crazy things to the training process without really affecting macroscopic properties of the model (e.g loss). one very well known example is that using mixed precision training results in training curves that are basically identical to full precision training, even though you're throwing out a ton of bits of precision on every step.
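a toy sketch of the same flavor of experiment, with loud caveats: this is not real mixed precision training, just full-batch gradient descent on a made-up linear regression where every gradient is optionally rounded to float16 before the update. the function name, problem size, learning rate, and seed are all arbitrary choices for illustration.

```python
import numpy as np

def train(round_grads_to_fp16, steps=200, lr=0.1, seed=0):
    """Gradient descent on a synthetic linear regression; returns the loss curve."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(256, 8))
    true_w = rng.normal(size=8)
    y = X @ true_w + 0.1 * rng.normal(size=256)
    w = np.zeros(8)
    losses = []
    for _ in range(steps):
        err = X @ w - y
        losses.append(float(np.mean(err ** 2)))
        grad = 2.0 * X.T @ err / len(y)
        if round_grads_to_fp16:
            # throw away most of the bits of every gradient, every step
            grad = grad.astype(np.float16).astype(np.float64)
        w -= lr * grad
    return np.array(losses)

full = train(round_grads_to_fp16=False)
rounded = train(round_grads_to_fp16=True)
# compare final losses and the largest gap anywhere along the two curves
print(full[-1], rounded[-1], np.max(np.abs(full - rounded)))
```

a convex toy problem like this is a much easier case than a real network, so it only illustrates the claim and is not evidence for it.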
people often say that limitations of an artistic medium breed creativity. part of this could be the fact that when it is costly to do things, the only things done will be higher effort
any time someone creates a lot of value without capturing it, a bunch of other people will end up capturing the value instead. this could be end consumers, but it could also be various middlemen. it happens not infrequently that someone decides not to capture the value they produce in the hopes that the end consumers get the benefit, but in fact the middlemen capture the value instead
an example: open source software produces lots of value. this value is partly captured by consumers who get better software for free, and partly by businesses that make more money than they would otherwise.
the most clear cut case is that some businesses exist purely by wrapping other people's open source software, doing advertising and selling it for a handsome profit; this makes the analysis simpler, though to be clear the vast majority of cases are not this egregious.
in this situation, the middleman company is in fact creating value (if a software is created in a forest with no one around to use it, does it create any value?) by using advertising to cause people to get value from software. in markets where there are consumers clueless enough to not know about the software otherwise (e.g legacy companies), this probably does actually create a lot of counterfactual value. however, most people would agree that the middleman getting 90% of the created value doesn't satisfy our intuitive notion of fairness. (open source developers are more often trying to have the end consumers benefit from better software, not for random middlemen to get rich off their efforts)
and if advertising is commoditized, then this problem stops existing (you can't extract that much value as an advertising middleman if there is an efficient market with 10 other competing middlemen), and so most of the value does actually accrue to the end user.
saying "sorry, just to make sure I understand what you're saying, do you mean [...]" more often has been very valuable
i wonder how much the following hypothesis is true. it's obviously not completely true, but maybe there is some value in contemplating it.
for most issues that people have expended a lot of effort arguing about, if you could truly impartially reason through it, run experiments, etc, the correct answer is either not that hard to figure out, or that we're pretty sure we can't know one way or the other. but the discourse is fucked because the vast majority (maybe literally all) of people have some bottom line they're already sympathetic to, and smart people can make plausible sounding arguments for any conclusion, and so truly reasoning things through impartially is both very hard to do and even if you somehow manage to do it, it's hard to signal that you did so, and anyways people with motivated reasoning will only listen to you if your answer happened to agree with theirs.
the main evidence i have for this hypothesis is that there are questions where one side is overwhelmingly obviously correct if you actually think about it or look into it, and yet, there is the appearance of a balanced public debate.
the main evidence against this hypothesis is that probably people who disagree also think they're overwhelmingly obviously correct, and it seems arrogant to declare that i am simply more correct, and in any case oftentimes people disagreeing are wrong but their disagreement still contains enough of a kernel of truth to be worth thinking about.
corollary: the ability to look upon difficult controversial problems with utter naivety is extremely valuable. it's probably bad to always be in this state, because you can get pwned by bad actors. but cultivating an ability to enter into this state, possibly through a community where people value this a lot, is extremely valuable (LW is the closest thing to this that exists in the world afaik, but please let me know if there is anywhere else that is better)
i love the concept of upvote-disagree. this feature singlehandedly encapsulates a lot of what i like about LW
conlang idea: an extremely easy to learn language with the following attributes:
people love to hate on brutalism. my take is there is something really aesthetic about it, but also that just because something is aesthetic doesn't mean that it would be a good place to live or work every day. in fact, I've unfortunately found that the environments that I find aesthetically pleasing and the environments that make me happy to live/work in diverge quite substantially.
it's kind of crazy that spaced repetition has completely revolutionized language learning and then not really changed the world in any other way at all. why are there no great scientists who are inhumanly good at remembering the corpus of their field through incremental reading? why are there no insanely good engineers who know every detail of their entire stack through spaced repetition?
i'm going to rerun the neurips agi experiment this year. place your bets on what fraction of people at neurips this year know what the acronym AGI stands for!
where are all the people trying to understand how the world works? (in a broad sense that is useful for understanding the trajectory of the world: e.g things like why is society the way it is, why do people behave the way they do; why has technology developed the way it has, etc.; as opposed to zooming in and specializing, e.g fundamental physics research or biomed or whatever) there are a bunch of people like this in the rationalist sphere but i'm curious where all the non-rationalist-adjacent such people are. it seems many people in the broader world are either uncurious or mindkilled on such questions.
Some books you might like to read:
All of these books to various degrees tackle the things you are describing from a holistic perspective. Hope this helps.
hypothesis: there will be a window of time after the point of superhuman AI persuasion/charisma, during which human trust relationships will become extremely important. even once almost all human skills are obsolete, the AIs may have less social trust capital than humans. ofc, eventually, the persuasion will be so superhuman that it can cut through minds like butter, but that could take many years.
once AI superpersuasion is possible, there will be a strong incentive to use it to shape opinions. therefore, there will also be a strong incentive for important decision makers to develop strategies for making sure they are not being bamboozled.
thankfully, there is one way to not get bamboozled by a superpersuader - to not listen to it in the first place. this is an age old idea. many social strategies have evolved to help people avoid talking to other superpersuading people. for example:
idea for a conference:
i want to host a conference which is kind of a cross between an unconference and a hackathon. the goal would be to create an environment where people can spontaneously do random side projects.
in the same way that Minecraft teaches you to exercise agency and Factorio teaches you to optimize, are there any games that teach you to stare into the abyss? the ideal game would (a) reward you on a tight feedback loop for constantly admitting that you were wrong, (b) give you the option to not admit that you were wrong but make that decision acutely hurt. pastcasting is good for (a) but not good for (b) because you are sort of forced to confront being wrong all the time, which maybe teaches you that it doesn't feel as bad as you might expect, but it doesn't teach you to intentionally seek out things that could prove you wrong; and you don't really have time to develop an attachment to your wrong ideas. most normal games reward you for staring into the abyss very indirectly because being good at intentional practice makes you do better over the very long run, but you don't get immediate feedback loops for it, and so it's easy to just not realize you could be doing a lot better.
Chess. Mistakes in chess usually become noticeable quickly, in just a move or two, and you have no RNG or teammates to blame them on. But to get better you have to acknowledge your mistakes and avoid making the same mistakes again.
Play against a strong chess engine while allowing yourself to undo as many moves as you like at any time and try to find any winning game?
my life would often be better if I exercised more agency. why don't I do so more often? here is a taxonomy of reasons I've noticed:
there seem to be three different possible levels of manager involvement in individual researchers:
a great way to get someone to dig into a position really hard (whether or not that position is correct) is to consistently misunderstand that position
almost every single major ideology has some strawman that the general population commonly imagines when they think of the ideology. a major source of cohesion within the ideology comes from a shared feeling of injustice from being misunderstood.
hypothesis: intellectual progress mostly happens when bubbles of non tribalism can exist. this is hard to safeguard because tribalism is a powerful strategy, and therefore insulating these bubbles is hard. perhaps it is possible for there to exist a monopoly on tribalism to make non tribal intellectual progress happen, in the same way a monopoly on violence makes it possible to make economically valuable trade without fear of violence
i feel like the fundamental mistake the project of rationality made was that "cognitive biases" is not in practice the right way to think about the way humans are irrational if your goal is to be very instrumentally rational. one hypothesis is the correct frame is to first deeply understand how the emotional system works, and then to think about ways to master that system to achieve rationality.
(yes, i know that buried somewhere in the sequences it says something like "humans aren't ideal intelligences with cognitive biases bolted on. we are the cognitive biases, they are just trying to approximate rationality".)
By the time I went to CFAR in 2019 this felt like it had already become the dominant flavor of inner-circle rationalist thinking, but then that inner circle kind of petered out in influence. The person I see carrying that torch most loudly in my current social atmosphere is Chris Lakin.
But overall rationality has been kind of quiescent imo! Ray posts good stuff, Duncan has his own thing, but it feels like we went from mid-2010s “rationalists talk a big game but don’t get anything done” to the mid-2020s most influential rationalists being too object-level busy to blog much about this metacognitive stuff.
If we understand "irrational" to mean something like "underperforming relative to what should be feasible", then I think one significant piece of the puzzle is the regime of "very impoverished hypothesis spaces". In this regime, Bayes (and deviations from it) is more of a peripheral conceptual frame (allowing you to understand some edge cases and some basics and some constraints) than a central guide or even that much of a useful tool. A much more important question is about hypothesis generation, aka abduction or "unupdating", i.e. expanding your hypothesis space. Another piece of the puzzle is that hypotheses are nothing like full possible worlds (as in many elementary models of epistemology), but rather are very-partial-possible-world-parts. Yet another piece of the puzzle is that concepts are very much not only or even mainly about prediction, propositions, and explanation (narrowly construed), but rather mainly about manipulation (including mental manipulation, e.g. "how could I have thought that faster or more efficiently"). Understanding how to think well in this regime, specifically in cases where you don't already have all / almost all of the understanding you need to win ...
an idea that i associate with bay area strains of buddhism is something like "life is just a series of distractions, you are distracted from distraction only by even more distractions, and it's all because you are experiencing suffering that is too uncomfortable to focus on, so if you somehow dispel all of it you are confronted with the existential dread of the impermanence of life, and dispelling that is the final boss." i might be completely misunderstanding something, I'm not a Buddhism expert by any means, please correct me if I'm wrong.
I've updated a lot towards something like this being at least kind of true. it seems like at least for a certain neurotype of person, much of one's behavior (ambition, addiction, hedonism, status, socialization, etc) serves as a way to distract from some kind of emotional pain. it looks different in different people; sometimes it looks like working a zillion hours so you have no time to reflect; or making status/money go up for its own sake; or drinking heavily; or playing video games; or spending lots of time at social events. the commonality is escaping the experience of the present in some way. in much the same way that you will flinch when o...
the part i most disagree with is the part where you're supposed to dissolve the pain of impermanence and fear of death. maybe many other fears should be dissolved, but it is good that impermanence is uncomfortable! you should be afraid of death and fill the void of existential dread with an ambition to end death forever.
You can work to end something without being afraid of it or finding it uncomfortable. A programmer looking for the cause of a bug in software is usually not afraid of the bug. If they did keep flinching away from the thought of the bug and finding the whole debugging thing uncomfortable, they'd do a worse rather than a better job at debugging.
If you want to end death, you'll do a better job at it if you can think about it clearly and not flinch away from considering things suggesting you personally might not make it. This requires not being afraid of it.
man vaccines are so fucking cool. it's awesome that there are like a dozen horrifyingly painful and deadly diseases that i will almost certainly never get in my life. i wish i could get vaccinated against literally everything
There's a statue of him in Los Angeles's Little Tokyo which I used to pay respects to when I visited for the New Year's festival. As I became an EA I would aspire to match or exceed his impact.
Sugihara continued to hand-write visas, reportedly spending 18 to 20 hours a day on them, producing a normal month's worth of visas each day, until September 4, 1940, when he had to leave his post before the consulate was closed. Sugihara reportedly worked at a quick pace and aimed to issue 200 to 300 visas each day. [...]
According to witnesses, he was still writing visas while in transit from his hotel and after boarding the train at Kaunas railway station, throwing visas into the crowd of desperate refugees out of the train's window even as the train pulled out.

Your economics are wrong for a few reasons. Let's grant the hypothetical where all humans supply homogeneous labor at a uniform wage.
The real fiscal issue in this scenario is that you are shifting output from labor to capital, and the tax rate on capital is lower than the tax on labor. (Moreover as you automate the economy there are further corporate reorganizations that would drive effective tax rates w...
theory: a large fraction of travel is because of mimetic desire (seeing other people travel and feeling fomo / keeping up with the joneses), signalling purposes (posting on IG, demonstrating socioeconomic status), or mental compartmentalization of leisure time (similar to how it's really bad for your office and bedroom to be the same room).
this explains why in every tourist destination there are a whole bunch of very popular tourist traps that are in no way actually unique/comparatively-advantaged to the particular destination. for example: shopping, amusement parks, certain kinds of museums.
from To the Success of our Hopeless Cause: interestingly, a big tension in the Soviet dissident movement was between people who believed in being 100% virtuous, embracing martyrdom, signing their names and addresses onto their dissenting samizdat texts, protesting to be arrested 5 minutes later and sent to jail, pretending that the letter of the Soviet law actually mattered, etc, vs people who believed in being more strategic and openly illegal and trying to avoid being caught. the former fades in importance because they keep getting arrested (the 1968 red square protest being the turning point).
idea: flight insurance, where you pay a fixed amount for the assurance that you will definitely get to your destination on time. e.g if your flight gets delayed, they will pay for a ticket on the next flight from some other airline, or directly approach people on the next flight to buy a ticket off of them, or charter a private plane.
pure insurance for things you could afford to self insure is generally a scam (and the customer base of this product could probably afford to self insure) but this mostly provides value by handling the rather complicated logistics for you rather than by reducing the financial burden, and there are substantial benefits from economies of scale (e.g if you have enough customers you can maintain a fleet of private planes within a few hours of most major airports)
it's often stated that believing that you'll succeed actually causes you to be more likely to succeed. there is an immediately obvious explanation for this: survivorship bias. obviously most people who win the lottery will have believed that buying lottery tickets is a good idea, but that doesn't mean we should take that advice. so we should consider the plausible mechanisms of action.
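a minimal sketch of the lottery point, under made-up assumptions: the win probability, ticket price, prize, and the share of people who believe tickets are worth buying are all hypothetical numbers picked so that a ticket has negative expected value. it just shows that "every winner believed" and "believing was a bad bet on average" can both be true at once.

```python
import random

def simulate(n_people=100_000, p_believe=0.5, p_win=0.001,
             ticket_cost=2.0, prize=1000.0, seed=0):
    """Toy lottery: only people who believe tickets are worth it buy one."""
    rng = random.Random(seed)
    believers = 0
    winners_total = 0
    winners_who_believed = 0
    believer_net = 0.0
    for _ in range(n_people):
        believes = rng.random() < p_believe
        won = believes and rng.random() < p_win  # no ticket, no chance to win
        if believes:
            believers += 1
            believer_net += (prize if won else 0.0) - ticket_cost
        if won:
            winners_total += 1
            winners_who_believed += believes
    return {
        "winners": winners_total,
        "share_of_winners_who_believed": winners_who_believed / winners_total,
        "avg_net_per_believer": believer_net / believers,
    }

stats = simulate()
```

conditioning on winners makes belief look perfectly predictive of success, while the average believer still loses money. this is why the correlation alone isn't evidence and the mechanism has to be argued separately.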
first, it is very common for people with latent ability to underestimate their latent ability. in situations where the cost of failure is low, it seems net positive to at least take seriously the hypothesis that you can do more than you think you can. (also keeping in mind that we often overestimate the cost of failure). there are also deleterious mental health effects to believing in a high probability of failure, and then bad mental health does actually cause failure - it's really hard to give something your all if you don't really believe in it.
belief in success also plays an important role in signalling. if you're trying to make some joint venture happen, you need to make people believe that the joint venture will actually succeed (opportunity costs exist). when assessing the likelihood of success...
hot take: analogies should not be used as evidence for positions, except as the weakest form used to privilege an otherwise arbitrary hypothesis to any consideration at all. otherwise, they should be used purely as a way to effectively convey a hypothesis, but the actual evidence needs to come from something other than the mere analogy itself.
hot response: all evidence is analogy, it's just a matter of degree. Maybe a heuristic like this is a good way to motivate gathering closer, more appropriate evidence, the better to increase confidence.
i can name many examples of evidence that are not analogy. perhaps they're arguably "just" analogy, but it would be in an obviously boring way. like how there is "just" subjective experience because you can never really directly observe reality, and yet clearly there is a difference between empirical evidence and a priori reasoning (note that the evidence for this analogy comes from the examples below, rather than the analogy itself)
It seems to me that you are greatly broadening (that is, redefining) the word “analogy” to mean any sort of approximation. For example, survey results are not an analogy to the ground truth but an approximation to it.
my summary of these two papers: https://arxiv.org/pdf/1805.12152 https://arxiv.org/pdf/1905.02175
the first paper observes a phenomenon where adversarial accuracy and normal accuracy are at odds with each other. the authors present a toy example to explain this.
the construction involves giving each input one channel that is 90% accurate for predicting the binary label, and a bazillion iid gaussian channels that are as noisy as possible individually, so that when you take the average across all of them you get ~100% accuracy. they show that when you do $\ell_\infty$-adversarial training on the input you learn to only use the 90% accurate feature, whereas normal training uses all bazillion weak channels.
the key to this construction is that they consider an $\ell_\infty$-ball on the input (distance is the max across all coordinates). so this means by adding more and more features, you can move further and further in $\ell_2$ space (specifically, $O(\sqrt{n})$ in terms of the number of features $n$). but the distance between the means of the two high dimensional gaussians stays constant, so no matter what your $\epsilon$ is, at some point with enough channels you can pertur...
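a rough simulation of this toy setup, to make the tradeoff concrete. this is my own sketch, not code from either paper: the per-channel mean `eta = 3 / sqrt(d)`, the max-norm budget `eps = 2 * eta`, and the two hand-written classifiers (average the weak channels vs. read only the strong channel) are assumptions chosen for illustration, and nothing is actually trained here.

```python
import math
import random

def make_example(rng, d, eta, p=0.9):
    """One sample: a label, one strong channel, and d weak gaussian channels."""
    y = rng.choice((-1, 1))
    strong = y if rng.random() < p else -y              # correct 90% of the time
    weak = [rng.gauss(eta * y, 1.0) for _ in range(d)]  # tiny signal, unit noise
    return y, strong, weak

def sign(v):
    return 1 if v >= 0 else -1

def evaluate(n=1000, d=400, seed=0):
    rng = random.Random(seed)
    eta = 3.0 / math.sqrt(d)   # per-channel signal shrinks as d grows
    eps = 2.0 * eta            # max-norm budget: largest change per coordinate
    hits = {"avg_clean": 0, "avg_adv": 0, "strong_clean": 0, "strong_adv": 0}
    for _ in range(n):
        y, strong, weak = make_example(rng, d, eta)
        # classifier that averages all the weak channels
        hits["avg_clean"] += sign(sum(weak)) == y
        # adversary shifts every weak coordinate by eps toward the wrong class
        hits["avg_adv"] += sign(sum(w - eps * y for w in weak)) == y
        # classifier that only reads the strong channel
        hits["strong_clean"] += sign(strong) == y
        # the same per-coordinate shift is too small to flip a +-1 channel
        hits["strong_adv"] += sign(strong - eps * y) == y
    return {k: v / n for k, v in hits.items()}

results = evaluate()
```

with these choices the averaging classifier should be near 100% on clean inputs and near 0% under attack, while the strong-channel classifier sits around 90% in both cases. the per-coordinate shift is small, but across `d` coordinates it adds up to `eps * sqrt(d)` in euclidean distance, which equals the full gap between the two class means no matter how large `d` gets.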
Is it a very universal experience to find it easier to write up your views if it's in response to someone else's writeup? Seems like the kind of thing that could explain a lot about how research tends to happen if it were a pretty universal experience.
it's actually insane how much of the entire economy is tech now. somewhere in the back of my head i still expected traditional "big" industries like big oil, or the banks, or whatnot to be the biggest. but i just realized this hasn't been true for a long time.

Looking at market cap is kinda misleading though; the public stock market is not the same thing as the economy, and tech is over-represented in market cap because of winner-take-all dynamics and margins.
Also, Amazon (setting aside AWS) is primarily a consumer goods and logistics company, and Tesla is a car and battery manufacturer - they're gigantic in part because they use tech well, but the actual goods and economic activity they generate aren't exactly "tech".
i haven't had a chance to think deeply about it but vibes wise i don't like activation oracles
i think it's really weird that people are trying to do vaguely interp flavored things but also trying to argue for the goodness of such techniques via empirical usefulness. i think there are broadly two self consistent worldviews here. one is that you want to understand how NNs actually work and then use that understanding for something. the other is you want to make models better at X (where X can be anything from "be a good chatgpt model" to "refuse bioweapon prompts" to "make weak to strong setup score go up"). but if you're doing the latter the actual conceptually important part is picking the right X and then working really hard to make it go up using whatever techniques work. if you're doing the former you should actually try to understand things whatsoever. it doesn't make sense to try to do both and ultimately get neither. you should either do pragmatic or do interp.
The argument I would make is that you want to solve the practical problem, but you want to do so in a way that maximally scales with intelligence. And then white box techniques are more scalable than black box techniques, since schemers will predictably fool your black box techniques but not necessarily your white box techniques.
inside people with substantial internal conflict, their parts might even be less aligned/connected with each other than they are with other people. this probably has really weird effects
hot take: introspection isn't really real. you can't access your internal state in any meaningful sense beyond what your brain chooses to present to you (e.g visual stimuli, emotions, etc), for reasons outside of your direct control. when you think you're introspecting, what's really going on is you have a model of yourself inside your brain, which you learn gradually by seeing yourself do certain things, experience certain stimuli or emotions, etc.
your self-model is not fundamentally special compared to any other models you have. it works the same way as your model of anyone or anything else, except you have way more data on yourself, and also you directly experience your own emotions and sensory stimuli, as opposed to having to infer them for other people. often your emotional brain sabotages your ability to understand yourself, but also it sometimes sabotages your ability to understand other people too (e.g groupthink, tribalism).
your self-model can diverge arbitrarily far from reality. when you're emotionally unintegrated, you have a model of yourself that fails to understand how your emotions truly work, so you will systematically mispredict...
when i first came to the bay area, i was shocked that silicon valley was literally just a bunch of suburbs and boring office parks. i used to think this was very incongruous. i still mostly do, but now i at least have a story for why the vibe isn't literally maximally incongruous.
one software architecture intuition i have is that legibility and standardization are utterly essential. the move from "servers as pets" (each one bespoke and carefully hand crafted, when something goes wrong, going in and fixing that one server) to "servers as cattle" (standardize all of the servers to be exact clones, destroy and recreate each server any time something deviates even slightly from identical rather than try to patch things on that one server by hand) is a good example. or for instance, you generally want to make things use common interfaces wherever possible, even if that means some slight awkwardness or indirectness, because the value of being able to interchange things is so high.
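a toy sketch of the "cattle" half of this, with everything hypothetical: `GOLDEN_SPEC`, `Server`, `provision`, and `reconcile` are names made up for illustration and don't correspond to any real orchestration tool. the point is only that drift is handled by replacement, never by patching in place.

```python
from dataclasses import dataclass, field
from itertools import count

# hypothetical golden spec that every server is supposed to match exactly
GOLDEN_SPEC = {"image": "base-v7", "packages": ("nginx", "python3")}

_ids = count(1)

@dataclass
class Server:
    spec: dict
    id: int = field(default_factory=lambda: next(_ids))

def provision():
    """Create a fresh server as an exact copy of the golden spec."""
    return Server(spec=dict(GOLDEN_SPEC))

def reconcile(fleet):
    """Replace, rather than repair, every server that has drifted from the spec."""
    healthy = []
    replaced = 0
    for server in fleet:
        if server.spec == GOLDEN_SPEC:
            healthy.append(server)
        else:
            # cattle approach: never patch in place, just destroy and recreate
            healthy.append(provision())
            replaced += 1
    return healthy, replaced

fleet = [provision() for _ in range(3)]
fleet[1].spec["packages"] = ("nginx",)  # someone hand-edited one server
fleet, replaced = reconcile(fleet)
```

because every server is interchangeable, `reconcile` never needs to know what went wrong with the drifted one. that is the legibility being bought by giving up bespoke fixes.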
modern car oriented cities are a great example of taking this intuition and applying it to the physical world. steamrolling the unruly nature of old fashioned cities and reshaping everything to fit a handful o...
I have a little stored thought which sometimes triggers, and it reads:
"If you find yourself being forced to choose between two or more extremely bad options that involve burning your values, your resources, or your life, the truth is that you lost around three moves ago and are living out the equivalent of a forced mate in chess. You've already lost, so stop playing and find a better game to spend time on if at all possible."
what's the current state of analysis on whether the civil rights act of 1957 was actually net positive or negative for civil rights in hindsight? there are two possible stories one can tell, and at the time people were arguing about which is correct:
this feels directly analogous to the question of whether we should accept very weak AI safety regulations today.
there's an obvious synthesis of great man theory and broader structural forces theories of history.
there are great people, but these people are still bound by many constraints due to structural forces. political leaders can't just do whatever they want; they have to appease the keys of power within the country. in a democracy, the most obvious key of power is the citizens, who won't reelect a politician that tries to act against their interests. but even in dictatorships, keeping the economy at least kind of functional is important, because when the citizens are starving, they're more likely to revolt and overthrow the government. there are also powerful interest groups like the military and critical industries, which have substantial sway over government policy in both democracies and dictatorships. many powerful people are mostly custodians for the power of other people, in the same way that a bank is mostly a custodian for the money of its customers.
also, just because someone is involved in something important, it doesn't mean that they were maximally counterfactually responsible. structural forces often create possibilities to become extremely influential, but only in the direc...
one kind of reasoning in humans is a kind of instant intuition; you see something and something immediately and effortlessly pops into your mind. examples include recalling vocabulary in a language you're fluent in, playing a musical instrument proficiently, or having a first guess at what might be going wrong when debugging.
another kind of reasoning is the chain of thought, or explicit reasoning: you lay out your reasoning steps as words in your head, interspersed perhaps with visuals, or abstract concepts that you would have a hard time putting in words. It feels like you're consciously picking each step of the reasoning. Working through a hard math problem, or explicitly designing a codebase by listing the constraints and trying to satisfy them, are examples of this.
so far these map onto what people call system 1 and 2, but I've intentionally avoided these labels because I think there's actually a third kind of reasoning that doesn't fit well into either of these buckets.
sometimes, I need to put the relevant info into my head, and then just let it percolate slowly without consciously thinking about it. at some later time, insights into the problem will suddenly and unpredictably...
the possibility that a necessary ingredient in solving really hard problems is spending a bunch of time simply not doing any explicit reasoning
I have a pet theory that there are literally physiological events that take minutes, hours, or maybe even days or longer, to happen, which are basically required for some kinds of insight. This would look something like:
First you do a bunch of explicit work trying to solve the problem. This makes a bunch of progress, and also starts to trace out the boundaries of where you're confused / missing info / missing ideas.
You bash your head against that boundary even more.
Since there are basically no alignment plans/directions that I think are very likely to succeed, and adding "of course, this will most likely not solve alignment and then we all die, but it's still worth trying" to every sentence is low information and also actively bad for motivation, I've basically recalibrated my enthusiasm to be centered around "does this at least try to solve a substantial part of the real problem as I see it". For me at least this is the most productive mindset to be in, but I'm slightly worried people might confuse this for me having a low P(doom), or being very confident in specific alignment directions, or so on, hence this post that I can point people to.
I think this may also be a useful emotional state for other people with similar P(doom) and who feel very demotivated by that, which impacts their productivity.
for some reason, the irrational belief that nobody will read my shortforms paradoxically makes them much easier to write. if I'm writing something polished that i think lots of people will read, then i get scared that people will see it and think less of me or something, which manifests as unreasonable perfectionism and a desire to present a fictitious version of myself and my thinking. i wonder if there is some way to get the best of both worlds - to produce more authentic but also high quality widely read posts
thoughts on what to work on in a world with heavy AI automation
it seems undeniable at this point that AI automation will play a huge role in all research in the medium term future. therefore, we should take automation into account when choosing what to work on. here are some possibilities, in decreasing order of how optimistic you are that models will be good at alignment research.
for people who are not very good at navigating social conventions, it is often easier to learn to be visibly weird than to learn to adapt to the social conventions.
this often works because there are some spaces where being visibly weird is tolerated, or even celebrated. in fact, from the perspective of an organization, it is good for your success if you are good at protecting weird people.
but from the perspective of an individual, leaning too hard into weirdness is possibly harmful. part of leaning into weirdness is intentional ignorance of normal conventions. this traps you in a local minimum where any progress on understanding normal conventions hurts your weirdness, but isn't enough to jump all the way to the basin of the normal mode of interaction.
(epistemic status: low confidence, just a hypothesis)
if any municipality in the bay area were to choose to allow lots of housing, then it would very quickly get manhattanized and they would make a zillion dollars in tax revenue, while harming property values in nearby cities. so naively you'd expect that surely eventually one random municipality of the dozens in the bay area will do this. but NIMBYs are so strong everywhere that this never happens. this seems directly relevant to questions of the feasibility of international coordination on AGI, especially if facilitated by strong pressure from labor to stop AGI.
the bay area is also evidence that we can't just assume that economically incentivized things are inevitable. the opportunity cost of not building up the bay area more is trillions of dollars. but people are willing to destroy immense amounts of value to preserve their self interest and the world they're familiar with.
California building code alone is quite restrictive. It's true that municipalities could allow building a lot more housing, but a lot of the cost comes from state-wide or even nation-level building codes.
any time there exists an activity that is (a) often but not always beneficial, (b) the supposed benefit is high status, and (c) the success of which is nontrivial to verify, then there will exist a bunch of people walking around who do the thing, and haven’t actually gained the intended benefit; nonetheless, they go around claiming the status benefits of doing the thing. often, they even genuinely believe they got the benefit. some examples:
the new openai planar unit distance result kills my last remaining doubts about AI being a huge multiplier on research productivity in the near term future. i was not expecting this to happen so soon; i would have guessed probably another year before we got a result like this.
i get the impression that the previous problems were mostly just neglected, or otherwise were less impressive than they seemed. whereas afaict mathematicians agree the new result is on a real well-known problem and genuinely surprising and novel.
why are malaria nets 9-23x more efficient than direct cash transfers? when in theory direct cash transfers can be used to purchase nets
some hypotheses
survey: what brand of melatonin do you use? i want to run an experiment on melatonin degradation using the most popular brand of melatonin.
I don't want to have to make every shortform a self contained article. it makes sense that full posts should explain the context, but I would find it very exhausting to have to e.g explain that I work at openai every single time I shortform post about anything openai related. if lesswrong shortform is the wrong place to do this, I'm happy to post elsewhere.
i'm thinking of starting a new blog. it would be about some amount of AI/alignment stuff of course, but also about lots of random other things. for instance, some blog post ideas:
thing i need your help with:
why is airplane wifi still so garbage? i imagine many business travellers would pay a lot for good airplane wifi.
presumably because to improve airplane wifi, you'd need to launch dozens of rockets to deliver a massive new constellation of orbiting satellites in order to deliver an order-of-magnitude improvement over Intelsat or whoever usually provides wifi connections to planes.
The good news is that SpaceX has done this, with their Starlink constellation! (Others like OneWeb, Baidu, and Amazon's Project Kuiper are also doing similar stuff.) But not every airline / airplane has upgraded to new Starlink receivers yet. So, most planes (and cruise ships, etc.) still have slow Intelsat/Globalstar internet, but others have indeed seen huge upgrades in internet speeds.
in defense of putting your python imports in the middle of your file (in global scope, not inside functions)
#pragma once to get python-like behavior
I mean, the proximate cause of the 1989 protests was the death of the quite reformist general secretary Hu Yaobang. The new general secretary, Zhao Ziyang, was very sympathetic towards the protesters and wanted to negotiate with them, but then he lost a power struggle against Li Peng and Deng Xiaoping (who was in semi retirement but still held onto control of the military). Immediately afterwards, he was removed as general secretary and martial law was declared, leading to the massacre.
a common discussion pattern: person 1 claims X solves/is an angle of attack on problem P. person 2 is skeptical. there is also some subproblem Q (90% of the time not mentioned explicitly). person 1 is defending a claim like "X solves P conditional on Q already being solved (but Q is easy)", whereas person 2 thinks person 1 is defending "X solves P via solving Q", and person 2 also believes something like "subproblem Q is hard". the problem with this discussion pattern is it can lead to some very frustrating miscommunication:
If you did that in the US, you'd end up trying to explain to prosecutors, then a judge, then a jury why you shouldn't be imprisoned for the rest of your life. I would hope that is roughly how it would go everywhere.
not OP, but that seems like a pretty reasonable conclusion. if i had to sacrifice my own life to save every person i didn't personally know (ie. 8.1 billion people), i would absolutely do it in a heartbeat. i would also do it to just save a fraction of those people (8M people). only once it gets down to much smaller fractions (saving 100-3 random people) does it start seeming like a hard tradeoff.
have you ever heard anyone make the argument that it's good to have AI safety aligned frontier labs (including but not limited to Anthropic) because they will have a seat at the table with the regulators, and the regulators will take major industry players' opinions more seriously than minor players or activists?
i've heard this argument but i'm trying to figure out if it's common enough to be worth writing a post about
thoughts on lemborexant
pros: if you take it, you will fall asleep 30-60 minutes later. nothing else I've tried has been as reliable at making sure I definitely fall asleep, and as far as I can tell, it doesn't destroy my sleep quality. especially at 10mg, you can feel it knocking you out, and you basically can't power through it even if you want to. it's a bit scary but all powerful sleep drugs are at least a bit scary and often a lot more scary. I generally take 5mg instead.
cons: it doesn't do anything to keep you asleep; if your body doesn't really want to sleep, you will wake up 2 hours later fully alert. it also doesn't do anything to shift your sleep schedule. these facts combined mean that if you try to use lemborexant for jet lag / shifting sleep earlier, then your life will suck indefinitely until you stop using lemborexant. my current recipe is to only use lemborexant when it's near enough to my normal bedtime, and I use melatonin 3 hours before bed to slowly move sleep schedule earlier (later requires no special effort)
(potentially this also means lemborexant can be used to get nice 2 hour daytime naps? I have enough fear of god sleep drugs that I feel hesitant to try any kind of hack like this)
(not medical advice. not a doctor, and even if I was a doctor I'm not your doctor, and even if I was your doctor I wouldn't be communicating to you via lesswrong shortforms)
a simple elegant intuition for the relationship between SVD and eigendecomposition that I haven't heard before:
the eigendecomposition of A tells us which directions A stretches along without rotating. but sometimes we want to know all the directions things get stretched along, even if there is rotation.
why does taking the eigendecomposition of $A^TA$ help us? suppose we rewrite $A = RS$, where S just scales (i.e is normal matrix), and R is just a rotation matrix. then, $A^TA = S^TR^TRS = S^TS$, and the R's cancel out because transpose of rotation matrix is also its inverse.
intuitively, imagine thinking of A as first scaling in place, and then rotating. then, $A^TA$ would first scale, then rotate, then rotate again in the opposite direction, then scale again. so all the rotations cancel out and the resulting eigenvalues of $A^TA$ are the squares of the scaling factors.
This is almost right, but a normal matrix is not a matrix that “just scales”, it's a normal matrix which can do whatever linear operation it likes.
SVD tells us there exists a factorization $A = U \Sigma V^T$ where $U$ and $V$ are orthogonal, and $\Sigma$ is a “scaling matrix” in the sense that it's diagonal. Therefore, using similar logic to you, $A^TA = V \Sigma^T U^T U \Sigma V^T = V \Sigma^T \Sigma V^T$, which means we rotate, scale by the singular values twice, then rotate back, which is why the eigenvalues of this are the squares of the singular values, and the eigenvectors are the right singular vectors.
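A quick numerical sanity check of the claim in this thread, as a minimal sketch rather than anything from the original posts. It builds a 2x2 matrix as rotation, then diagonal scaling, then rotation (the names `U`, `Sigma`, `V`, the angles, and the singular values are all made-up assumptions for illustration), forms $A^TA$, and compares its eigenvalues to the squared singular values. It only uses the standard library, so the eigenvalues come from the closed form for a symmetric 2x2 matrix.

```python
import math

def rot(theta):
    """2x2 rotation matrix by angle theta (orthogonal: transpose == inverse)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def sym_eigvals(M):
    """Eigenvalues of a symmetric 2x2 matrix via trace and determinant."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return sorted([(tr - disc) / 2, (tr + disc) / 2])

# Hypothetical singular values and rotation angles, chosen arbitrarily.
sigma = [3.0, 0.5]
U, V = rot(0.7), rot(-1.2)
Sigma = [[sigma[0], 0.0], [0.0, sigma[1]]]

# A = U Sigma V^T, i.e. rotate, scale, rotate.
A = matmul(matmul(U, Sigma), transpose(V))

# In A^T A the U's cancel, leaving V Sigma^2 V^T.
AtA = matmul(transpose(A), A)
eigs = sym_eigvals(AtA)
squared = sorted(s * s for s in sigma)
```

Here `eigs` and `squared` should agree up to floating point error, whatever angles are picked for `U` and `V`.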
philosophy: while the claims "good things are good" and "bad things are bad" at first appear to be compatible with each other, actually we can construct a weird hypothetical involving exact clones that demonstrates that they are fundamentally inconsistent with each other
law: could there be ambiguity in "don't do things that are bad as determined by a reasonable person, unless the thing is actually good?" well, unfortunately, there is no way to know until it actually happens
lifehack: buying 3 cheap pocket sized battery packs costs like $60 and basically eliminates the problem of running out of phone charge on the go. it's much easier to remember to charge them because you can instantaneously exchange your empty battery pack for a full one when you realize you need one, plugging the empty battery pack happens exactly when you swap for a fresh one, and even if you forget once or lose one you have some slack
One possible model of AI development is as follows: there exists some threshold beyond which capabilities are powerful enough to cause an x-risk, and such that we need alignment progress to be at the level needed to align that system before it comes into existence. I find it informative to think of this as a race where for capabilities the finish line is x-risk-capable AGI, and for alignment this is the ability to align x-risk-capable AGI. In this model, it is necessary but not sufficient for good outcomes for alignment to be ahead by the time it's at the finish line: if alignment doesn't make it there first, then we automatically lose, but even if it does, if alignment doesn't continue to improve proportional to capabilities, we might also fail at some later point. However, I think it's plausible we're not even on track for the necessary condition, so I'll focus on that within this post.
Given my distributions over how difficult AGI and alignment respectively are, and the amount of effort brought to bear on each of these problems, I think there's a worryingly large chance that we just won't have the alignment progress needed at the critical juncture.
I also think it's ...
on discovering new songs
the spotify recommender algorithm sucks. also, i often find i'm very unfamiliar with very well-known pieces of music. so i decided to do something weird. i used LMs to scrape several best songs lists from different online sources, merged them into one gigantic list, and used spotipy to create a spotify playlist of all of those random songs. whenever anyone recommends me a song, i also throw it into this giant playlist. then, when i want to explore new songs, i just put this playlist on. i have another script that automatically removes any songs i've put into my liked songs already, and i also manually remove songs i really don't like. this system has helped me discover dozens of new songs that i like.
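The merge-and-prune part of this workflow can be sketched without touching the Spotify API at all. This is a minimal illustration, not the actual script: the function names, the `(title, artist)` tuple format, and the placeholder songs are all assumptions, and the real version would pass the result to spotipy to build the playlist.

```python
def normalize(title, artist):
    """Key used to treat differently-cased or padded entries as the same song."""
    return (title.strip().lower(), artist.strip().lower())

def merge_song_lists(lists):
    """Merge several 'best songs' lists into one, dropping duplicates
    while keeping first-seen order."""
    seen = set()
    merged = []
    for songs in lists:
        for title, artist in songs:
            key = normalize(title, artist)
            if key not in seen:
                seen.add(key)
                merged.append((title, artist))
    return merged

def remove_songs(playlist, to_remove):
    """Drop songs that are already liked (or manually vetoed)."""
    drop = {normalize(t, a) for t, a in to_remove}
    return [(t, a) for t, a in playlist if normalize(t, a) not in drop]

# Placeholder data standing in for scraped lists and the liked-songs library.
list_a = [("Song A", "Artist 1"), ("Song B", "Artist 2")]
list_b = [("song a ", "artist 1"), ("Song C", "Artist 3")]
liked = [("Song B", "Artist 2")]

exploration_playlist = remove_songs(merge_song_lists([list_a, list_b]), liked)
```

Matching on normalized title and artist is a crude choice; matching on track IDs would be more robust once the songs have been resolved against a catalog.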
is it worth writing blog posts about “obvious” things? i’ve been doing a lot of writing recently, and i frequently finish writing something, and i look at it, and i feel like it’s so obvious that all readers will either already agree and not learn anything, or disagree so fundamentally that changing their mind would require diving much deeper into fundamental beliefs.
theory: a big difference between people who hate corporations and people who don't is the extent to which they like interacting with human-shaped things. some people like human shaped things and the sort of amoral profit maximization of companies feels alien and sociopathic. other people like the predictable API that companies provide.
is it generally best to take just one med (e.g antidepressant, adhd, anxiolytic), or is it best to take a mix of many meds, each at a lesser dosage? my intuitions seem to suggest that the latter could be better. in particular, consider the following toy model: your brain has parameters $\theta$ that should be at some optimal $\theta^*$, and your loss function is a quadratic around $\theta^*$. each dimension in this space represents some aspect of how your brain is configured - they might for instance represent your level of alertness, or impulsivity, or risk averseness, or motivation, etc. each med is some vector $v$ that you can add to your current state $\theta$, and the optimal dosage of that med in isolation is whichever quantity gets you closest to $\theta^*$; but unless $v$ happens to be exactly colinear with $\theta^* - \theta$, you basically can't do any better just by tuning the dosage of the one med. this seems especially important because most meds don't seem to be exactly monosemantic, and also different people start out with substantially different $\theta$ and loss landscapes, such that you often get paradoxical reactions to meds.
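A tiny numerical version of the toy model, purely as an illustration. The two-dimensional state, the med vectors `v1` and `v2`, and the gap `d` between optimal and current state are all invented numbers, and the loss is just squared distance to the optimum. Nothing here says anything about real drugs.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def best_single_dose(v, d):
    """Dose alpha minimizing |d - alpha * v|^2, and the loss left over."""
    alpha = dot(v, d) / dot(v, v)
    residual = [di - alpha * vi for vi, di in zip(v, d)]
    return alpha, dot(residual, residual)

def best_two_doses(v1, v2, d):
    """Doses (a, b) minimizing |d - a*v1 - b*v2|^2 via the 2x2 normal equations."""
    g11, g12, g22 = dot(v1, v1), dot(v1, v2), dot(v2, v2)
    b1, b2 = dot(v1, d), dot(v2, d)
    det = g11 * g22 - g12 * g12
    a = (b1 * g22 - b2 * g12) / det
    b = (g11 * b2 - g12 * b1) / det
    residual = [di - a * x - b * y for x, y, di in zip(v1, v2, d)]
    return (a, b), dot(residual, residual)

# Hypothetical numbers: d is (optimal state - current state).
d = [1.0, 1.0]
v1 = [1.0, 0.5]   # med 1 mostly moves dimension 0, with a side effect on 1
v2 = [0.5, 1.0]   # med 2 mostly moves dimension 1, with a side effect on 0

single_dose, single_loss = best_single_dose(v1, d)
two_doses, two_loss = best_two_doses(v1, v2, d)
```

With these made-up vectors, the best dose of med 1 alone still leaves some loss, while the two-med mix reaches the optimum with each med at a smaller dose than the single-med optimum. That only happens because `v1` and `v2` together span the direction of `d`; with more dimensions than meds some loss would generally remain.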
I made a manifold market about how likely we are to get ambitious mechanistic interpretability to GPT-2 level: https://manifold.markets/LeoGao/will-we-fully-interpret-a-gpt2-leve?r=TGVvR2Fv
sure, you can notice extremely large effect sizes through vibes. but the claim is that for even "smaller" effect sizes (like, tens of percentage points, e.g 50->75%), you need pretty big sample sizes. obviously 0->100% doesn't need a very large sample size.
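To get a rough sense of scale, here is a sketch using the textbook normal-approximation sample size formula for comparing two proportions. The significance level of 0.05 and power of 0.8 are conventional defaults assumed for illustration, not numbers from this thread, and the function name is made up.

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate samples needed per group to detect p1 vs p2
    with a two-sided test (normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

n_moderate = n_per_group(0.50, 0.75)   # the 50->75% case
n_small = n_per_group(0.50, 0.55)      # a few percentage points
n_extreme = n_per_group(0.0, 1.0)      # the 0->100% case
```

Under these assumptions the 50->75% case needs dozens of observations per group, a few-point difference needs over a thousand, and 0->100% needs almost none. The approximation is crude at the extremes, so the last number is only indicative.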
I agree that chatgpt obviously has lots of name recognition but I do also separately think chatgpt has less name recognition than you might guess. I predict that only 85% of Americans would get a multiple choice question right about what kind of app chatgpt is (choices: artificial intelligence; social media; messaging and calling; online dating). whereas a control question about e.g Google will get like 97% or whatever the lizardman constant dictates
idea: survey people about whether 3^^^3 toe stubbings can be worse than torture, except with a twist: with 50% probability, arrange the furniture in the room such that people actually accidentally stub their toe right before answering the survey
i find it disappointing that a lot of people believe things about trading that are obviously crazy even if you only believe in a very weak form of the EMH. for example, technical analysis is obviously tea leaf reading - if it were predictive whatsoever, you could make a lot of money by exploiting it until it is no longer predictive.
Close friend of mine, a regular software engineer, recently threw tens of thousands of dollars - a sizable chunk of his yearly salary - at futures contracts on some absurd theory about the Japanese Yen. Over the last few weeks, he coinflipped his money into half a million dollars. Everyone who knows him was begging him to pull out and use the money to buy a house or something. But of course yesterday he sold his futures contracts and bought into 0DTE Nasdaq options on another theory, and literally lost everything he put in and then some. I'm not sure but I think he's down about half his yearly salary overall.
He has been doing this kind of thing for the last two years or so - not just making investments, but making the most absurd, high risk investments you can think of. Every time he comes up with a new trade, he has a story for me about how his cousin/whatever who's a commodities trader recommended the trade to him, or about how a geopolitical event is gonna spike the stock of Lockheed Martin, or something. On many occasions I have attempted to explain some kind of Inadequate Equilibria thesis to him, but it just doesn't seem to "stick".
It's not that he "rejects" the EMH in these ...
i think it's quite valuable to go through your key beliefs and work through what the implications would be if they were false. this has several benefits:
economic recession and subsequent reduction in speculative research, including towards AGI, seems very plausible
AI (by which I mean, like, big neural networks and whatever) is not that economically useful right now. furthermore, current usage figures are likely an overestimate of true economic usefulness because a very large fraction of it is likely to be bubbly spending that will itself dry up if there is a recession (legacy companies putting LLMs into things to be cool, startups that are burning money without PMF, consumers with disposable income to spend on entertainment).
it will probably still be profitable to develop AI tech, but things will be much more tethered to consumer usefulness.
this probably doesn't set AGI back that much but I think people are heavily underrating this as a possibility. it also probably heavily impacts the amount of alignment work done at labs.
one man's modus tollens is another man's modus ponens:
"making progress without empirical feedback loops is really hard, so we should get feedback loops where possible" "in some cases (i.e close to x-risk), building feedback loops is not possible, so we need to figure out how to make progress without empirical feedback loops. this is (part of) why alignment is hard"
A common cycle:
Sometimes this even results in better models over time.
how much good has moral conviction done throughout history?
one extreme view you can have is everything good comes from moral conviction, and that without it everything would be moloch slop. the opposite view is a Randian view that everything good comes from practical incentives, and that moral convictions are at best futile and at worst actively harmful.
who has done the highest quality research on learning (and transfer learning in particular) in humans? specifically, i'm curious to answer questions like:
it is often observed that children like celebrating birthdays, aspiring to be older, and then when they reach a particular age, they realize the error of their ways and treat impending birthdays as a mark of getting closer to death. while it is generally assumed that this is because the evidence of how shitty aging is only becomes evident with age, there is also a mathematical explanation. each year, your expected remaining lifespan changes by some amount. for most of your life, this is close to -1 per year, because you almost certainly weren't about to su...
most history is done in a very humanitiespilled, academia flavored way. are there good examples of people doing very analytical, capital-intensive history research where the quality of the work is judged based on how successfully the resulting theories made good predictions/decisions?
publicly registering a bet with Gabor Hollbeck:
i predict that the median CS postdoc will be publishing less than 100 papers a year 5 years from now. Gabor predicts otherwise.
i wish more showers adopted the design where there is an on/off knob and a temperature knob. it's so obviously better. on the other hand, i hate the single knob showers
why are there no bicameral legislatures with one chamber apportioned by population, and one chamber apportioned by economic productivity?
I'm not sure if this is the answer you're looking for, but: most things that could exist don't. The space of ideas is wide, and few of them are implemented in practice. Is this idea particularly privileged in the space of possible governance ideas, in such a way where you would have expected it to have been tried?
What about costly signals? E.g. every year, each state chooses how much money they donate to the federal government. Their voting power in the second chamber is proportional to the size of their donation.
the concept of a spontaneous unscheduled phone call is so strange and alien to me. you're telling me there are people out there who want to be interrupted at random points in their day, and a large fraction of the time they are able to just pick up and talk? rather than constantly getting voicemails, and then leaving voicemails back because by the time you get around to replying, the caller is busy? do these people spend most of their days doing neither deep work nor being in social situations that would be rude to suddenly step away from?
I love spontaneous unscheduled phone calls!
You are telling me there are people out there who when they want to make progress on something that is blocked by another person, or where whenever some kind of thinking is best aided by another person, just... wait for hours or days at a time until they respond? Juggling 15-20 different messaging threads without getting any focused work done, instead of simply calling the person, resolving the issue and moving on? Do these people spend most of their days just waiting on other people to get back to them, or being in pre-scheduled calls all day that are scheduled for 30 minutes despite being resolvable in a 5-minute phone call?
sodium cotransport is really cool. while the gut can absorb glucose and sodium individually through several different pathways, there is a really important transporter (SGLT1) which carries glucose and sodium at the same time.
this is really important for rehydration. suppose you have cholera and vomit a lot and get super dehydrated as a result. drinking just water sucks, because you need to replenish the electrolytes that you're losing too. but water with salts is still not optimal, because it's absorbed less efficiently (also i think cholera interferes wi...
idk exactly how, they just pop up to my mind easily. maybe because i am very aware of the things i'm disappointed about not having done. also, i can consult my todo list, which is effectively a list of things i will never do because i don't have enough agency. like i'm going to set a timer for 10 minutes and write as many things as i can think of:
I'd be really excited if anyone wanted to look at training circuit sparse models on the AlgZoo tasks and seeing if we can push the frontier of understandability.
it would be funny if, in the future, the boot sequence for the dyson sphere supercomputer still starts out in 16-bit real mode. the world's most expensive 8086
obviously there's also a lot of consumer demand, but I wonder how much of the trend towards food with less complicated ingredients being marketed with that as a major pro is because it's more technically impressive to accomplish (my layman understanding is that the easy way to make viable commercial food is to just toss in a bunch of preservatives and emulsifiers and stabilizers and you have a lot of margin for error, and avoiding them requires a lot of creativity in leveraging the specific properties of the food you're dealing with / modifying the packaging strategy to create a more elegant solution)
everyone is a few hops away from everyone else. this applies in both directions: when I meet random people they always have some weak connection to other people I know, but also when I think of a collection of people as a cluster, most specific pairs of people within that cluster barely know each other except through other people in the cluster.
I am confused why you think my claims are only semi related. to me my claim is very straightforward, and the things i'm saying are straightforwardly conveying a world model that seems to me to explain why i believe my claim. i'm trying to explain in good faith, not trying to say random things. i'm claiming a theory of how people parse information, to justify my opening statement, which i can clarify as:
the world is too big and confusing, so to get anything done (and to stay sane) you have to adopt a frame. each frame abstracts away a ton about the world, out of necessity. every frame is wrong, but some are useful. a frame comes with a set of beliefs about the world and a mechanism for updating those beliefs.
some frames contain within them the ability to become more correct without needing to discard the frame entirely; they are calibrated about and admit what they don't know. they change gradually as we learn more. other frames work empirically but are a...
for something to be a good way of learning, the following criteria have to be met:
trying to do the thing you care about directly hits 2 but can fail 1 and 3. many things that you can study hit 1 but fail 2 and 3. and of course, many fun games hit 3 (and sometimes 1) but fail to hit 2.
it's actually crazy how much ubering pareto dominates driving in a city like SF. you don't have to worry about parking, you can work while in transit, you can get a bigger car when needed, you don't need to round trip, etc. it's generally even cheaper once you take depreciation, parking, insurance, etc costs into consideration.
higher margins rightfully means higher market cap. if your company is barely scraping by, youre not producing as much value.
Right, I think the market caps are justified for the most part. But market caps represent the present value of expected future profits, not a measure of current economic activity.
higher margins rightfully means higher market cap. if your company is barely scraping by, youre not producing as much value.
Not much surplus; you can still be a commodity around which huge volumes of production and consumption revolve even if your prospects f...
Perhaps. I think writing things into contracts is a great way to make sure that they happen, and if the counterparty is unwilling to sign them into contracts, then this is a strong sign that you won't be able to make it happen later. It would have significantly increased the adversarial relationship between Anthropic and the USG for them to politely remove it from the contract and then work hard internally to make sure that it never got used that way. Maybe it would've been worth it, but I'm not convinced.
day 1 of using a new phone: there cannot be a single small bubble under my screen protector. it must be perfect.
day 1000 of using the phone: the square inch sized chunk of dead pixels on the screen is fine because it doesn't usually cover anything important, and I can still read words in between the cracks
conference talks aren't worth going to irl because they're recorded anyways. ofc, you're not actually going to remember to watch the recording, but it's not like anyone pays attention at the irl talk anyways
a thriving culture is a mark of a healthy and intellectually productive community / information ecosystem. it's really hard to fake this. when people try, it usually comes off weird. for example, when people try to forcibly create internal company culture, it often comes off as very cringe.
there are two different modes of learning i've noticed.
often the easiest way to gain status within some system is to achieve things outside that system
Corollary to Others are wrong != I am right (https://www.lesswrong.com/posts/4QemtxDFaGXyGSrGD/other-people-are-wrong-vs-i-am-right): It is far easier to convince me that I'm wrong than to convince me that you're right.
has anyone done a good analysis of how to reduce fatality and injury risk of driving over a baseline of normal Uber? in particular, how much would each of the following matter:
also in particular interested in time-of-day segmented stats. several factors make this difficult. time of day accident data is confounded by intoxication and fatigue; but there is some bleed over, because someone else crashing into you is a large fraction as...
the nuclear option
in the Senate of the US Congress, there is a "nuclear option" for overriding filibusters; a parliamentary method that can be used to ram legislation through. both sides generally agree to use it sparingly, because it's a symmetric weapon.
i think there is a similar lever in arguments that is best left untouched if your goal is to actually find the truth. there is a type of argument that can be deployed in a wide range of circumstances, and is very hard to rebut except with even more nuclear arguments. the most extreme example is, suppose y...
Tesla is bigger by market cap but if you look at metrics like revenue and the number of cars sold it's much smaller. Tesla earns its market cap by the hope that its technology will be more significant in the future when it has fully-driverless cars.
tbh, i don't really understand the concept of themes/symbolism in fiction books. aside from the most literal things. how much of this is just people being pretentious and/or reading tea leaves?
one problem with UBI as a solution for AI economic disruption: at the moment when AI can first replace a human job, it will probably cost only epsilon less than the human. the cost will be mostly capital (datacenters, chips, electric plants, etc), rather than labor. so we can only afford to give the human epsilon UBI. as time goes on, eventually the AI gets cheap enough that humans can get substantial UBI, possibly exceeding their original income, as the AIs become more productive than the humans were. but there's a big gap in the middle that we need to br...
This can't be right. The troublesome point you describe happens when there are already enough "AI workers" to displace all current jobs, but the extra productivity is still only epsilon (why?) and the number of "AI workers" isn't growing explosively far beyond that (why?)
Anyway, the real problem isn't that capital owners won't have enough money to pay us UBI. It's that they... won't pay us UBI. Simple as that.
can someone who works in quant/HFT/market making help me understand whether the following is correct?
(assuming there is only a single exchange for simplicity,) order execution is hard because (a) the order book is of finite size, so placing a large order induces slippage, and (b) if you make a series of trades spaced far apart to wait for more liquidity to show up, the value of the stock can move, (c) if you make a series of trades predictably, then HFTs can fuck you over by clearing out the order book 1ms before you buy and turning around to sell to you.
s...
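Point (a) can be made concrete with a toy order book. This is a minimal sketch under invented numbers: the `asks` levels, the order sizes, and the `market_buy` helper are hypothetical, and it ignores fees, hidden liquidity, and the book refilling over time.

```python
def market_buy(asks, quantity):
    """Walk the ask side of the book and return the average fill price.

    asks: list of (price, size) levels sorted by ascending price.
    """
    remaining = quantity
    cost = 0.0
    for price, size in asks:
        take = min(size, remaining)   # fill as much as this level offers
        cost += take * price
        remaining -= take
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("not enough liquidity in the book")
    return cost / quantity

# Hypothetical book: best ask is 100.00, deeper levels cost more.
asks = [(100.00, 200), (100.05, 300), (100.20, 500)]

small = market_buy(asks, 100)   # fits inside the best level
large = market_buy(asks, 800)   # eats through three levels
slippage = large - asks[0][0]   # average price paid above the best ask
```

The small order fills at the best ask, while the large one pays more on average because it consumes deeper levels. That gap is the slippage that motivates splitting orders over time, which in turn exposes you to (b) and (c).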
I am not a quant, but have some related background. (Those who know this area best, may not be inclined to say.)
"Real traders" have many ways to avoid getting front-run to the extreme degree suggested in (c), including limit orders and "trying not to be that predictable" by disguising action to look like other forms of flow.
The amount of pain you experience from (b) depends on whether you think your strategy's value decays rapidly or slowly.
But there is a more general problem: it is not just HFTs but the market as a whole that reacts to your actions: your impact will shift the demand curve for the stock. the size of that impact depends on the information leaked by your actions, information leaked by passage of time, and time allowed for new liquidity to arrive.
there is academic work on theoretical "square-root laws of market impact"
https://mfe.baruch.cuny.edu/wp-content/uploads/2012/09/Chicago2016OptimalExecution.pdf
but predicting actual impact is hard for a number of reasons (limited data, causality issues)
Knowledge of what other players can and can't infer from your execution, and modeling impact patterns well, is a multiplier on the value of strategies, hence worth spending a lot to get right.
i really wish there were a better platform for repeatable cognitive testing than brainlabs.me. the website feels like it is about to fall over from a light breeze, and i would be very sad if i suddenly lost my method of measurement because the site disappeared. also, there doesn't seem to be particularly strong evidence that these tests in particular are the right ones to be looking at.
I've always been relatively unfamiliar with normal pop culture, so I recently decided to look at several online lists of best/most recognizable songs and made a spotify playlist of several hundred of them, with a bias towards more recent songs. I think this has been much better than the Spotify recommendation algorithm, which mostly shows me songs similar to ones I've already listened to.
are the Cambridge Brain Science cognitive tests actually reliable and relatively immune to practice effects? I want to have some mostly repeatable measurement of my own cognitive abilities over time, for health tracking reasons, but it's unclear to me how reliable it is
fun side project idea: create a matrix X and accompanying QR decomposition, such that X and Q are both valid QR codes that link to the wikipedia page about QR decomposition
I think the most important part of paying for goods and services is often not the raw time saved, but the cognitive overhead avoided. for instance, I'd pay much more to avoid having to spend 15 minutes understanding something complicated (assuming there is no learning value) than 15 minutes waiting. so it's plausibly more costly to have to figure out the timetable, fare system, remembering to transfer, navigating the station, than the additional time spent in transit (especially applicable in a new unfamiliar city)
current understanding of optimization
Some aspirational personal epistemic rules for keeping discussions as truth seeking as possible (not at all novel whatsoever, I'm sure there exist 5 posts on every single one of these points that are more eloquent)
I think I'd be confused. Do they care about more or better paperclips, or do they care about worship of paperclips by thinking beings? Why would they care whether I say I would do anything for paperclips, when I'm not actually making paperclips (or disassembling myself to become paperclips)?
not a lot of people (maybe literally 0) have had sufficient reason to saw off their own hand for altruistic reasons. i've donated a kidney, donate blood often, and gave more than the GWWC pledge when my income was high. any falsifiable claims you'd like to check while we're speculating about my values?
my point is that % of caring is not a coherent concept, or at least not the one that maps onto the intuitive notion of what % of your wealth you should donate.
specifically, suppose instead of there being 1e10 people, there were 1e100 people. i claim the % of your money you should donate should basically not change at all, even though the % of caring assigned to yourself has plummeted by a huge amount
A recent anecdote: I just created my own script that's close to roman alphabet structurally with a few modifications.
Learning it, memorizing the parts and some rules takes ~1 hour.
This makes me think that the hard thing about learning reading/writing is the initial visual-symbol-to-language mapping, not just getting the symbols and rules into your head.
lingao qiming is the hardest scifi I've ever read. it puts other "hard" scifi like project hail mary or three body problem to shame. the basic conceit of the book is that it's an isekai where some people discover a wormhole to a parallel universe exactly like ours but during the time of ming dynasty china, and decide to bring 500 technical specialists and a bunch of modern supplies to the past to try and conquer ancient china. the vast majority of the book is devoted to discussing every single technical aspect in excruciating, well-researched detail. you do...
Comment with practically 0 informational value (due to total absence of context) but 37 12 karma/agreement feels like "twitter" in the bad sense of this concept, not LW. Which is very sad for me as an old reader. You probably mean something related to American politics, but I suppose many users are not American and don't even have much knowledge about these things. Maybe you mean something totally different. Maybe OpenAI and Anthropic drama? I can't even make sense of this.
publicly registering a bet with Emmett Bicker: I predict that on February 16, 2027, there will be at least 3 people at one of openai/anthropic/GDM who work on kernels full time (or, a larger number of people spending part of their time working on kernels, such that the time spent adds up to 3 FTEs, capped at 50 people maximum). if all of these companies have gone bankrupt or pivoted heavily or cut their workforce substantially because of a market crash or AI winter, this resolves in my favor. if AI kills everyone or creates the glorious posthuman utopia before then, it resolves in Emmett's favor (regardless of whether there are still people who do kernels work for fun).
I honestly didn't think of that at all when making the market, because I think takeover-capability-level AGI by 2028 is extremely unlikely.
I care about this market insofar as it tells us whether (people believe) this is a good research direction. So obviously it's perfectly ok to resolve YES if it is solved and a lot of the work was done by AI assistants. If AI fooms and murders everyone before 2028 then this is obviously a bad portent for this research agenda, because it means we didn't get it done soon enough, and it's little comfort if the ASI sol...
ethical offsets for eating meat are difficult because it's hard to quantify the expected impact of e.g donating to an animal rights charity, and compare it to the impact of eating meat. (if you pay for a lobbyist to talk to a congressman for 30 minutes about larger cages for chicken farming, how much does this improve chicken lives, and how many chicken lives saved is that equivalent to?)
here's a much simpler solution: almost everyone agrees that a human is more morally valuable than a cow, even if the human is far away in a distant land. (the cow is also ...
I just meant that if an oracle told me ASI was coming in two years, I probably couldn't spend down energy reserves to get more done within that timeframe compared to being told it'll take ten years. I might feel a greater sense of urgency than I already do and perhaps end up working longer hours as a result of that, but if so that'd probably be an unendorsed emotional response I couldn't help, more than a considered plan. I kind of doubt I'd actually get more done that way. Some slack for curiosity and play is required for me to do my job well.
The stakes are already so high and time so short that varying either within an order of magnitude up or down really doesn't change things all that much.
Yes, I think frontier AI companies are responsible for most of the algorithmic progress. I think it's unclear how much the leading actor benefits from progress done at other slightly behind AI companies and this could make progress substantially slower. (However, it's possible the leading AI company would be able to acquire the GPUs from these other companies.)
hypothesis: the kind of reasoning that causes ML people to say "we have made no progress towards AGI whatsoever" is closely analogous to the kind of reasoning that makes alignment people say "we have made no progress towards hard alignment whatsoever"
ML people see stuff like GPT4 and correctly notice that it's in fact kind of dumb and bad at generalization in the same ways that ML always has been. they make an incorrect extrapolation, which is that AGI must therefore be 100 years away, rather than 10 years away
high p(doom) alignment people see current mode...
it is often claimed that merely passively absorbing information is not sufficient for learning, but rather some amount of intentional learning is needed. I think this is true in general. however, one interesting benefit of passively absorbing information is that you notice some concepts/terms/areas come up more often than others. this is useful because there's simply too much stuff out there to learn, and some knowledge is a lot more useful than other knowledge. noticing which kinds of things come up often is therefore useful for prioritization. I often notice that my motivational system really likes to use this heuristic for deciding how motivated to be while learning something.
Understanding how an abstraction works under the hood is useful because it gives you intuitions for when it's likely to leak and what to do in those cases.
takes on takeoff (or: Why Aren't The Models Mesaoptimizer-y Yet)
here are some reasons we might care about discontinuities:
The following things are not the same:
In the spirit of https://www.lesswrong.com/posts/fFY2HeC9i2Tx8FEnK/my-resentful-story-of-becoming-a-medical-miracle , some anecdotes about things I have tried, in the hopes that I can be someone else's "one guy on a message board". None of this is medical advice, etc.
a corollary to the hazards of arguing against bad takes: please don't write things that are defined entirely by trying to avoid the reader coming away with specific bad takes or misunderstandings people often have.
you should write things primarily to nail down the concepts unambiguously for an audience of generic smart people. your idea should be defined by what it is, and not what it is not. it isn't SCP-055.
if you really need to, add a "things i don't mean" section to concretely describe and disavow some common misunderstandings. but it should be possibl...
i often think specific capabilities projects are quite unlikely to work, and therefore not worth taking into account when coming up with my alignment approach, and also simultaneously that my alignment project is quite unlikely to work, but it's worth trying in case it does work. why this asymmetry?
i claim this is rational. the key is that upside risk and downside risk should be treated differently. if i think an alignment approach has a 1% chance of working, it might still be worth spending my life on. but if i think there’s a 1% chance some capabilities te...
i was recently in an Uber and the driver started talking to me about Musk v OpenAI (almost completely unprompted! i had only mentioned that i do computer stuff.)
sometimes, my answer to a question flips multiple times as you move along the axis from literal answer to spiritually-accurate answer. unfortunately hard to share specific examples for privacy reasons
higher margins rightfully means higher market cap. if your company is barely scraping by, you're not producing as much value.
This should be capturing rather than producing. (Arguably Meta produces negative value).
i predict that 10 years from today, i will be able to find a pound of ground beef in some supermarket in Boston for less than 10 USD (adjusted for inflation to dollars today). JC Tidefield predicts otherwise
OpenAI researcher confirmed to have "long timelines". Still expects supermarkets and ground beef to be a thing.
scifi setting idea: movement from rural areas and small cities to larger cities continues until approximately everyone lives in one of like 10 different megacities; all of the farmland and oil fields and mines and whatnot in between are 99% roboticized, with only occasional human repairs; all of the cities are tightly connected by supersonic travel, which becomes more feasible because there are very few people on the ground outside cities to get annoyed by the noise; drugs solve sleep and allow effortless adaptation to jet lag. uniquely, SF neither expands ...
what is the current best scientific understanding of how bad ozone redistribution (less ozone in upper stratosphere, but more in lower stratosphere, with same overall amount) is compared to ozone disappearing entirely?
i mean most companies won't eat you alive? you can form a bond with the coca cola company in the same one directional way as the lawnmower and it's not like they will take advantage of that to extract every dollar from you. in fact basically only like Facebook, tiktok, etc are like that, and even then they're not that bad; they're no worse than an abusive human partner
My idea was, maybe the AI company is willing to sell you 1 unit of AI labor at human-competitive price, but if you order 1000 units they'll ask for a higher price per unit, because they need to build more datacenters or something. In this case replacement of humans will be gradual even if all humans are equally productive. And another possibility is that humans aren't all equally productive, so AI will first get good enough to replace the worst worker, then the second worst and so on. From these two reasons I get the possibility that by the time lots of pe...
TIL that it's highly nontrivial to figure out which direction true north is given magnetic north and your location on earth.
I had always assumed that you could treat the earth as a big magnet with the magnetic north pole in a slightly different place than true geographical north. but apparently the magnetic field of the earth is a really weird fucked up shape.
it's pretty elegant that shapley values assign 1/population of the credit to each individual voter in an election.
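A minimal sketch of why this falls out, assuming the election is modelled as a simple majority game with identical voters (that modelling choice, the function names, and the 5-voter size are mine, not from any particular source). A voter's Shapley value is the fraction of arrival orderings in which they are the pivotal voter who tips the coalition into a majority; with symmetric voters every one of them is pivotal equally often.

```python
from fractions import Fraction
from itertools import permutations

def majority_wins(coalition_size, n_voters):
    """Characteristic function of the assumed game: 1 iff the coalition is a strict majority."""
    return coalition_size * 2 > n_voters

def shapley_values(n_voters):
    """Exact Shapley values by enumerating every ordering of the voters."""
    credit = [Fraction(0)] * n_voters
    orderings = list(permutations(range(n_voters)))
    for order in orderings:
        for position, voter in enumerate(order):
            before = majority_wins(position, n_voters)
            after = majority_wins(position + 1, n_voters)
            if after and not before:
                credit[voter] += 1  # this voter was pivotal in this ordering
    return [c / len(orderings) for c in credit]

values = shapley_values(5)
```

Brute-force enumeration is only feasible for a handful of voters; for a real electorate the same answer follows from symmetry plus the fact that Shapley values sum to the total value of the game.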
the difference between activation sparsity, circuit sparsity, and weight sparsity
activation sparsity enforces that features activate sparsely - every feature activates only occasionally.
circuit sparsity enforces that the connections between features are sparse - most features are not connected to most other features.
weight sparsity enforces that most of the weights are zero. weight sparsity naturally implies circuit sparsity if we interpret the neurons and residual channels of the resulting model as the features.
weight sparsity is not the only way to ...
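A toy illustration of the three notions on a single hypothetical ReLU layer, where each neuron is treated as a feature. The weight values and one-hot inputs are made up for the example. It measures the fraction of zero weights, lists the surviving neuron-to-neuron connections, and separately measures how often the output features fire.

```python
# Hypothetical toy layer: rows are output neurons, columns are input neurons.
weights = [
    [0.0, 1.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, -2.0],
    [0.7, 0.0, 0.0, 0.0],
]

def weight_sparsity(w):
    """Fraction of weights that are exactly zero."""
    flat = [v for row in w for v in row]
    return sum(1 for v in flat if v == 0.0) / len(flat)

def circuit_edges(w):
    """(input neuron, output neuron) pairs joined by a nonzero weight."""
    return [(j, i) for i, row in enumerate(w) for j, v in enumerate(row) if v != 0.0]

def relu_layer(w, x):
    """Output activations of the layer for one input vector."""
    return [max(0.0, sum(wij * xj for wij, xj in zip(row, x))) for row in w]

def active_fraction(w, inputs):
    """Fraction of (input, output feature) pairs where the feature is nonzero."""
    acts = [a for x in inputs for a in relu_layer(w, x)]
    return sum(1 for a in acts if a > 0.0) / len(acts)

# Made-up probe inputs: one-hot vectors over the four input neurons.
inputs = [[1.0 if j == k else 0.0 for j in range(4)] for k in range(4)]

zero_weight_fraction = weight_sparsity(weights)
edges = circuit_edges(weights)
firing_fraction = active_fraction(weights, inputs)
```

Here the zero weights directly remove edges from the neuron-level graph, which is the sense in which weight sparsity implies circuit sparsity. The firing fraction depends on the inputs as well as the weights, so it is a separate property.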
only a fool is easily parted from his money. but even the most wise, intelligent, and savvy are routinely parted from their power and influence
I wonder how many supposedly consistently successful retail traders are actually just picking up pennies in front of the steamroller, and would eventually lose it all if they kept at it long enough.
also I wonder how many people have runs of very good performance interspersed by big losses, such that the overall net gains are relatively modest, but psychologically they only remember/recount the runs of good performance, whereas the losses were just bad luck and will be avoided next time.
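A rough sketch of how a losing strategy can still produce a lot of clean-looking track records. The numbers are invented for illustration: each trade wins 1 unit with probability 0.99 and loses 150 units otherwise, and trades are assumed independent.

```python
def expected_profit_per_trade(win_prob, win_size, loss_size):
    """Average profit per trade for a small-win / rare-large-loss strategy."""
    return win_prob * win_size - (1.0 - win_prob) * loss_size

def prob_no_blowup(win_prob, n_trades):
    """Chance of seeing zero large losses in n independent trades."""
    return win_prob ** n_trades

# Hypothetical parameters, not estimates of any real strategy.
WIN_PROB, WIN_SIZE, LOSS_SIZE = 0.99, 1.0, 150.0

edge = expected_profit_per_trade(WIN_PROB, WIN_SIZE, LOSS_SIZE)
clean_record_50 = prob_no_blowup(WIN_PROB, 50)
clean_record_1000 = prob_no_blowup(WIN_PROB, 1000)
```

Under these assumptions the per-trade edge is negative, yet roughly 60% of traders running it would show an unbroken winning streak after 50 trades, while almost none would after 1000.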
My own expectation is that limitations result in creativity. Writer's block is usually a result of having too many possibilities/choices. If I tell you "You can write a story about anything", it's likely harder for you to think of anything than if I tell you "Write a story about an orange cat". In the latter situation, you're more limited, but you also have something to work with.
I'm not sure if it's as true for computers as it is for humans (that would imply information-theoretic factors), but there's plenty of factors in humans, like analysis paralysis and the "See also" section of that page
for a sufficiently competent policy, the fact that BoN doesn't update the policy doesn't mean it leaks any fewer bits of info to the policy than normal RL
aiming directly for achieving some goal is not always the most effective way of achieving that goal.
people love to find patterns in things. sometimes this manifests as mysticism- trying to find patterns where they don't exist, insisting that things are not coincidences when they totally just are. i think a weaker version of this kind of thinking shows up a lot in e.g literature too- events occur not because of the bubbling randomness of reality, but rather carry symbolic significance for the plot. things don't just randomly happen without deeper meaning.
some people are much more likely to think in this way than others. rationalists are very far along the...
more importantly, both i and the other person get more out of the conversation. almost always, there are subtle misunderstandings and the rest of the conversation would otherwise involve a lot of talking past each other. you can only really make progress when you're actually engaging with the other person's true beliefs, rather than a misunderstanding of their beliefs.
One of the greatest tragedies of truth-seeking as a human is that the things we instinctively do when someone else is wrong are often the exact opposite of the thing that would actually convince the other person.
here's a straw hypothetical example where I've exaggerated both 1 and 2; the details aren't exactly correct but the vibe is more important:
1: "Here's a super clever extension of debate that mitigates obfuscated arguments [etc], this should just solve alignment"
2: "Debate works if you can actually set the goals of the agents (i.e you've solved inner alignment), but otherwise you can get issues with the agents coordinating [etc]"
1: "Well the goals have to be inside the NN somewhere so we can probably just do something with interpretability or whatever"
2: "ho...
a claim I've been saying irl for a while but have never gotten around to writing up: current LLMs are benign not because of the language modelling objective, but because of the generalization properties of current NNs (or to be more precise, the lack thereof). with better generalization LLMs are dangerous too. we can also notice that RL policies are benign in the same ways, which should not be the case if the objective was the core reason. one thing that can go wrong with this assumption is thinking about LLMs that are both extremely good at generalizing ...
Schmidhubering the agentic LLM stuff pretty hard https://leogao.dev/2020/08/17/Building-AGI-Using-Language-Models/
scifi story idea: a post-upload world where we’ve discovered that the human brain actually consists of multiple independent conscious entities that merely have the illusion of being a single individual because they are physically colocated; and so in the glorious upload utopia, the fundamental unit of society is not individual humans, but rather their parts. humans become a multi-unit legal entity in the same way that families or married couples or corporations are multi-unit legal entities today; each part has rights and the ability to secede from the res...
Do these expectations take selection effects into account? I'm also thinking longer term than 60 years. A Malthusian equilibrium is the natural state for a population of organisms to be in. We're currently out of equilibrium, but the obvious expectation is that we will at some point settle back into a Malthusian equilibrium unless we somehow choose not to or otherwise go extinct.
https://archive.org/details/willsovietunions0000andr someone correctly predicted the collapse of the soviet union in 1970 (though he was off by 7 years)
I don't get it.
You have some pool of caring you are willing to donate, then in the case of where all other humans need a donation, they will each receive pool/total_pop. Then you care about each of them as pool/total_pop.
Like, if one encounters an opportunity to donate to a single stranger who needs it, people go above that pool/total_pop, but it doesn't mean they would give more than total of pool in previous case. The scaling is weird.
Your previous statements are unclear.
lots of things are very hard. making models do IMO problems is very hard, for example.
i guess there are two main questions. one is, why would we expect a method that makes LMs adversarially robust to also work on AGI? and second, even supposing we can know the technique to generalize to AGI, why would we expect the ability to adversarially robustify a reward model to help make an inner-misaligned model pursue the right goal?
why would it be a large advance in our alignment abilities? i don't see any reason why making gpt-5 refuse bioweapons reliably would be at all mechanistically analogous to aligning AGI
hypothesis: the wrong reason to read books is to feel a need to read books because you're supposed to have read them as an educated person, or as some kind of weird status thing of being part of the ingroup, or a general need to feel well read and worldly. the right reason is to feel a burning passion to find a specific piece of knowledge that will finally answer a question you are curious about that happens to be locked inside a specific book, or a gnawing pain in your heart that can only be quelled by knowing that it's a universal problem that someone out there across space and time understands and has fixed in themselves.
I don't really think there are right or wrong reasons to read books, just like there aren't right or wrong reasons to exercise. The benefits will accrue either way. Consider book clubs as analogous to running clubs in producing social pressure to keep reading.
i think SAEs are a completely reasonable thing under the first worldview, and mostly crazy under the second worldview (with the exception of maybe bio or something where I've heard they're genuinely useful)
(SAEs are not sufficient to actually understand things, but they are a genuine step on the way there)
people say London is declining. but walking around, i see construction everywhere, and many new skyscrapers that i don't remember seeing last time i visited 4 years ago.
if there was a guy who stood there swinging a scythe to cut grass and didn't seem to care or feel bad or really respond at all to accidentally cutting someone's arm off, we'd consider them uncaring and sociopathic. similarly, if we think of, say, an insurance company as a person, then when it declines someone's claim and leaves them destitute, it's reasonable to think of that person as uncaring and sociopathic. you can argue all you want about the economics of how insurance can only work if you do this but for the individual people who interface with this,...
On the one hand, yeah. On the other hand, the rest of the story (AFAICT based on your description) isn't really that sci-fi, let alone "hard", except insofar as it's set up by the time travel. You could just as well write a story about the Spaniards ultra-strategizing about efficiently conquering the Mexica or the Inca.
That's unfairly dismissive. I can't speak to retail, but manufacturing absolutely does require "deep work". Machining requires concentration and technique in order to ensure parts have the right tolerances, surface finish, etc. Assembly work often involves deep thinking in order to ensure that the machine is correctly assembled and properly configured.
It's not all routine "mind numbing" assembly line work, just as not all IT is routine mind numbing data entry.
Naive use of limit orders will cause you to lose the profitable trades, and fill the unprofitable ones. There are ways around this, but it's not trivial.
a funny but relatable response i got in the free response "other" box for a survey about things people are most worried about with AI:
How do you hide from a robot that's more intelligent than humans and can see through walls etc? You can't hide.
Yeah, I do feel confused about the extent to which the solution to this problem is just "selectively become dumber" (e.g. as discussed by Habryka here). However, I have faith that there are a bunch of Pareto improvements to be made—for example, I think that less neuroticism helps you get less pwned without making you dumber in general. (Though as a counterpoint, maybe neuroticism was useful for helping people identify AI risk?) I'd like to figure out theories of virtue and emotional health good enough to allow us to robustly identify other such Pareto impr...
aside from the luxury aspect, there are two major practical reasons you'd want to fly charter instead of commercial. one is flexibility spatially (you want to fly from some small city to some other small city without doing two layovers), and the other is flexibility temporally (you want to fly at an odd hour).
there are a bunch of airlines that tackle the spatial problem. why aren't there many that tackle the temporal problem? there don't seem to be any airlines that fly very small jets very frequently or at odd hours of the day for major routes.
to fly one ...
I don't think unsycophantic kindness is quite that difficult to achieve. clearly some groups of people IRL achieve such kindness. generally, people in such communities try to understand each other and why they believe the things they do without judgement in either direction, and affirm the emotional responses to beliefs rather than the beliefs themselves. you don't have to agree with someone to agree that you'd feel the same in their shoes. somehow, these groups of people don't inevitably slide into subtle sneering and trolling and sycophancy.
plus, the poi...
if anything, it seems more common that people dig into incorrect beliefs because of a sense of adversity against others
Consider cults (including milder things like weird "alternative" health advice groups etc.). Positivity and mutual support seem like a key element of their architecture, and adversity often primarily comes from peers rather than an outgroup. I'm not talking about isolated beliefs, content and motivations for those tend to be far more legible. A lot of belief memeplexes have either too few followers or aren't distinct enough from all the other nonsense to be explicitly labeled as cults or ideologies, or to be organized, but you generally can't argue their members out of alignment with the group (on relevant beliefs, considered altogether).
the point ... is to make it clear that when you are receiving kindness, you are not receiving updates towards truth
This is also a standard piece of anti-epistemic machinery of groups that reinforce some nonsense memeplex among themselves with support and positivity. Support and positivity are great, but directing them to systematically taboo correctness-fixing activity is what I'm gesturing at, the sort of "kindness" that by its intent and nature tends to trade off against correctness.
I'm sufficiently extroverted that if the social interaction goes well, it gives me more than enough psychological energy to pay for multiple additional social bids. obviously, this is separate from physiological energy; if I'm sleep deprived and physically exhausted, this is insufficient. but I don't generally get that physically exhausted from social interaction, unless I'm at neurips or something.
my mental model of how a pop triggers a broader crash is something like: a lot of people are taking money and investing it into AI stuff, directly (by investing in openai, nvidia, tsmc, etc) or indirectly (by investing in literally anything else; like, cement companies that make a lot of money by selling cement to build datacenters or whatever). this includes VCs, sovereign wealth funds, banks, etc. if it suddenly turned out that the datacenters and IP were worth a lot less than they thought they were, their equity (or debt) ownership is suddenly worth a lot less than they thought it was, and they may become insolvent. and lots of financial institutions becoming insolvent is pretty bad.
i would not only pay a lot per flight for good wifi, i would also fly way more often
I'm not sure how common this preference is.
I think that the economic gains from people traveling on business having access to better wifi on planes might be quite large[1], but airlines themselves are not well-positioned to capture very much of those gains. There are a very small number of domestic airlines which don't offer any wifi on their planes at all. The rest generally offer it for free, or for some relatively low price (on the order of $10). Often ...
I'm generally a very forgetful person. I forget people's names, my keys, my luggage, 2fa codes I saw 3 seconds ago, etc all the time. but for some reason I've never forgotten my hotel room number and needed to consult the written down number. this is weird because it's an arbitrary number that I'm given once and have to remember for a few days.
I used to think autism-to-autism communication was a thing; that is, autistic people get along best with other autistic people. I now think this model is partly true but also deeply flawed: in particular, there are many different types of autistic person, and not only do all types not get along with all types, it's not necessarily even true that people of the same type get along with each other (this is probably correlated with degree of self-love/acceptance or something). if anything, it's probably often even quite disconcerting and cognitive dissonance i...
(I don't know if the book is good but my knee jerk reaction to fitting sigmoids to things is it's a bit spooky - see https://arxiv.org/abs/2109.08065)
the LLM cost should not be too bad. it would mostly be looking at vague vibes rather than requiring lots of reasoning about the thing. I trust e.g AI summaries vastly less because they can require actual intelligence.
I'm happy to fund this a moderate amount for the MVP. I think it would be cool if this existed.
I don't really want to deal with all the problems that come with modifying something that already works for other people, at least not before we're confident the ideas are good. this points towards building a new thing. fwiw I think if building a new...
But not in full generality! This is a fine question to raise in this context, but in general the correct thing to do in basically all situations is to consider the object level, and then also let yourself notice if people are unusually insane around a subject, or insane for a particular reason. Sometimes that is the decisive factor, but for all questions, the best first pass is to think about how that part of the world works, rather than to think about the other monkeys who have talked about it in the past.
made an estimate of the distribution of prices of the SPX in one year by looking at SPX options prices, smoothing the implied volatilities and using Breeden-Litzenberger.
(not financial advice etc, just a fun side project)
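For anyone curious what the Breeden-Litzenberger step looks like, here is a minimal sketch, not the actual pipeline used above. Breeden-Litzenberger says the risk-neutral density of the price at expiry is e^{rT} times the second derivative of the call price with respect to strike. To keep it self-contained, the call prices below come from Black-Scholes with a single flat implied vol, standing in for prices rebuilt from a smoothed implied-vol curve. The spot, rate, vol, and strike grid are all made-up numbers.

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(spot, strike, years, rate, vol):
    """European call price; stands in for prices rebuilt from smoothed implied vols."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (vol * math.sqrt(years))
    d2 = d1 - vol * math.sqrt(years)
    return spot * normal_cdf(d1) - strike * math.exp(-rate * years) * normal_cdf(d2)

# Hypothetical inputs: flat 20% vol, 4% rate, one year to expiry.
spot, years, rate, vol = 5000.0, 1.0, 0.04, 0.20
dK = 10.0
strikes = [1000.0 + dK * i for i in range(1101)]  # 1000 .. 12000
calls = [black_scholes_call(spot, k, years, rate, vol) for k in strikes]

# Breeden-Litzenberger: density(K) = exp(r T) * d^2 C / dK^2,
# approximated with a central second difference at interior strikes.
interior = strikes[1:-1]
density = [
    math.exp(rate * years) * (calls[i + 1] - 2.0 * calls[i] + calls[i - 1]) / dK ** 2
    for i in range(1, len(strikes) - 1)
]

total_mass = sum(density) * dK
implied_mean = sum(k * p for k, p in zip(interior, density)) * dK
```

Two sanity checks worth running on any such estimate: the density should integrate to about 1, and its mean should land near the forward price. With real option quotes the second difference amplifies noise badly, which is presumably why smoothing the implied vols first matters.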
The Duke of Wellington said that Napoleon's presence on a battlefield “was worth forty thousand men”.
This would be about 4% of France's military size in 1812.
twitter is great because it boils down saying funny things to purely a problem of optimizing for funniness, and letting twitter handle the logistics of discovery and distribution. being e.g a comedian is a lot more work.
the financial industry is a machine that lets you transmute a dollar into a reliable stream of ~4 cents a year ~forever (or vice versa). also, it gives you a risk knob you can turn that increases the expected value of the stream, but also the variance (or vice versa; you can take your risky stream and pay the financial industry to convert it into a reliable stream or lump sum)
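The "dollar into ~4 cents a year forever" conversion is just the perpetuity formula: a payment c every year, discounted at rate r, is worth c / r today. A quick sketch, assuming a 4% discount rate (implied by the numbers above but an assumption nonetheless) and payments starting one year from now:

```python
def perpetuity_present_value(payment, rate):
    """Closed form: sum over t >= 1 of payment / (1 + rate)**t equals payment / rate."""
    return payment / rate

def truncated_present_value(payment, rate, years):
    """Same sum cut off after a finite number of years, as a numerical check."""
    return sum(payment / (1.0 + rate) ** t for t in range(1, years + 1))

closed_form = perpetuity_present_value(0.04, 0.04)
numeric_check = truncated_present_value(0.04, 0.04, 2000)
```

Running it the other way (the "vice versa") is the same formula: a reliable stream of c per year can be sold for about c / r up front.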
in a highly competitive domain, it is often better and easier to be sui generis, rather than a top 10 percentile member of a large reference class
It is unfortunately impossible for me to know exactly what happened during this interaction. I will say that the specific tone you use matters a huge amount - for example, if you ask to understand why someone is upset about your actions, the exact same words will be much better received if you do it in a tone of contrition and wanting to improve, and it will be received very poorly if you do it in a tone that implies the other person is being unreasonable in being upset. From the very limited information I have, my guess is you probably often say things in a tone that's not interpreted the way you intended.
it ought to be possible to gift value to end customers rather than requiring the richest to be the ones who get the benefit, how can that be achieved?
The simple mechanism is:
Of course, you could make the UBI be to (e.g.) Taylor Swift fans in particular, but this is hardly a principled approach to redistribution.
Separately, musicians (and other performers) might want to subsidize tickets for extremely hard core fan...
prediction markets have two major issues for this use case. one is that prediction markets can only tell you whether people have been calibrated in the past, which is useful signal and filters out pundits but isn't very highly reliable for out of distribution questions (for example, ai x-risk). the other is that they don't really help much with the case where all the necessary information is already available but it is unclear what conclusion to draw from the evidence (and where having the right deliberative process to make sure the truth comes out at the end is the cat-belling problem). prediction markets can only "pull information from the future" so to speak.
an interesting fact that I notice is that in domains where there are a lot of objects in consideration, those objects have some structure so that they can be classified, and how often those objects occur follows a power law or something, there are two very different frames that get used to think about that domain:
House rules for definitional disputes:
A few axes along which to classify optimizers:
Some observations: it feels l...
A thought pattern that I've noticed myself and others falling into sometimes: Sometimes I will make arguments about things from first principles that look something like "I don't see any way X can be true, it clearly follows from [premises] that X is definitely false", even though there are people who believe X is true. When this happens, it's almost always unproductive to continue to argue on first principles, but rather I should do one of: a) try to better understand the argument and find a more specific crux to disagree on or b) decide that this topic isn't worth investing more time in, register it as "not sure if X is true" in my mind, and move on.
well, no, i think alignment will probably get solved by someone who has delusional confidence that they have the one true approach, who maybe on some intellectual level knows that it's 1% likely to succeed but feels irrationally driven to make it work, embedded inside a system that is not delusional and able to assess alignment approaches rationally. most such people will be wrong in their delusion. perhaps the world will be destroyed by people with delusional confidence embedded inside systems that are also delusional. but it's very hard to truly devote yo...
theory: most people fall into one of the following categories (or some mix of them):
what's the best argument for why we should take Rawls's veil of ignorance seriously? it seems there are a wide range of possible theories you could have of consciousness/individualism, and they are basically unfalsifiable.
This is maybe offtopic to the thread, but I think the impression of language proficiency depends a lot on accent, and adults learning a foreign language don't spend nearly enough time on accent. A few weeks of watching youtube videos in the target language, trying to imitate the sounds exactly right, is a small effort which will yield amazing results at any age. But for some reason adults don't do it.
You can always use xcancel.com as a mirror for X: https://xcancel.com/hendrycks/status/2052422910133104670
https://x.com/hendrycks/status/2052422910133104670?s=20
hendrycks doubles down on the claim in this thread
i don't think you understood my argument. i didn't say you assign each of them 1/1000th of your total caring. i said you should assign each of them 1/1000th as much caring as you assign yourself. so you should occupy 1000/7 billion of your caring, and Bob from Randomland occupies 1/7 billion of your caring.
the entire point of my argument is it actually doesn't matter what % of your own caring you take up. that's not the relevant thing. the relevant thing is how much you care about each stranger relative to yourself, and the shape of your money utility curve.
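a rough worked version of this, as a sketch only. it assumes log utility of money (my choice for illustration; any concave curve gives the same qualitative picture) and a caring weight of 1/1000 per stranger. the function names are hypothetical. note that the number of strangers never appears in the comparison, only the relative weight and the two wealth levels:

```python
# Hypothetical illustration. Assumes log utility of money, u(w) = ln(w),
# so the marginal utility of one more dollar at wealth w is 1 / w.
CARE_WEIGHT = 1 / 1000  # caring per stranger, relative to yourself (assumed)


def marginal_value_to_self(own_wealth: float) -> float:
    """Value to you of keeping one more dollar yourself."""
    return 1.0 / own_wealth


def marginal_value_of_giving(stranger_wealth: float,
                             care_weight: float = CARE_WEIGHT) -> float:
    """Value to you of one stranger receiving one more dollar."""
    return care_weight / stranger_wealth


def should_give(own_wealth: float, stranger_wealth: float,
                care_weight: float = CARE_WEIGHT) -> bool:
    """True when the marginal dollar does more good (by your own weights) given away."""
    return (marginal_value_of_giving(stranger_wealth, care_weight)
            > marginal_value_to_self(own_wealth))
```

under these assumptions the break-even point is `stranger_wealth < care_weight * own_wealth`: someone with 1,000,000 should give to a stranger with less than 1,000, while someone with 100,000 should only give to a stranger with less than 100.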
Put another way: by having gravitational advantage, it's able to exert power over people on earth while being highly resistant to people on earth exerting power over it.
It is a castle at the top of the largest hill, which means it can rain down whatever it likes on the people below while not worrying that they will climb up to put a stop to it.
brainstorming thread: which people in history had the largest positive (counterfactual) impact on the world, by their own values (CEV, if they could see the consequences in hindsight)?
I had to Google "LTV". I believe it means the Labour Theory of Value, that the work put in to create something is a measure of that thing's value. Seems absurd to me. Is there anyone here who believes in it? Or elsewhere, even?
I don't know how many people explicitly believe it but there is a general worldview that inherently assumes it. There are common memes that use this to show the unfairness of pay disparity, such as this one. Inherently it assumes the only fair way for one person to be paid 351x more than another is if they work 351x harder - LTV. https://www.reddit.com/r/antiwork/comments/yrdbyg/ceos_are_not_worth_351_times_the_average_worker/
this would be very helpful! if someone has already done a high quality version of this experiment then i don’t need to do another one
i’m somewhat concerned that capsules are not super airtight, and the powder inside is also permeable to air.
Anthropic negotiated a great deal and gave up the practical limits that are relevant to the military about using their models for cyber attacks, censorship and disinformation campaigns in the process.
Bicameral systems can respect one person one vote, and they even do in the 49 states with bicameral legislatures.
The US Senate is a weird kludge that was necessary to secure support for the constitution. No one would arrive at it from first principles.
The best argument you can make for the Senate is that it's necessary to protect vulnerable minorities (residents of small states). And democracies tend to trade off in various ways to protect vulnerable minorities at the expense of democratic purity (e.g., via constitutional provisions). That's a pretty sill...
the three-class franchise seems like one reasonable implementation of this type of policy within a unitary state. within a US-like federation, you could have the house continue to function as it does right now, and have a senate where the number of senators is proportional to the tax revenue contributed by each state.
The old guard of Berkeley will sometimes wax poetic about Spoonrocket. It was indeed glorious.
https://sashachapin.substack.com/i/173991096/4-connection-is-about-dancing-to-the-music
is there actually strong evidence that doing meditation improves your ability to do social modelling?
Minmaxing trap is not happening. I am only allowed to do one edit per finished session, and that edit can be just an increment by 1 or a decrement by 1 of some parameter in the generator, which takes ~15 seconds. If my priorities change, the generator will eventually converge (ease in over a couple of days) through the increments to the new state. That prevents "being hyped" and placing "all in" into some new exciting project. The new project will gain weight only if it keeps looking worthy.
I may adjust at the end of a session if i feel that something should h...
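A minimal sketch of the one-edit-per-session rule described above. It assumes the generator is a weighted random picker over projects, which the comment does not actually say; the class, method, and project names are all hypothetical.

```python
import random


class SessionGenerator:
    """Hypothetical weighted picker with one +/-1 edit per finished session."""

    def __init__(self, weights):
        self.weights = dict(weights)   # project name -> integer weight (assumed form)
        self.edit_available = False    # no edit until a session has finished

    def run_session(self, rng=random):
        """Pick a project in proportion to its weight, then unlock one edit."""
        names = list(self.weights)
        choice = rng.choices(names, weights=[self.weights[n] for n in names], k=1)[0]
        self.edit_available = True
        return choice

    def adjust(self, name, delta):
        """Apply the single allowed edit: +1 or -1 to one parameter."""
        if delta not in (1, -1):
            raise ValueError("an edit is an increment or decrement by exactly 1")
        if not self.edit_available:
            raise RuntimeError("only one edit per finished session")
        new_value = self.weights[name] + delta
        if new_value < 0 or sum(self.weights.values()) + delta <= 0:
            raise ValueError("weights must stay non-negative with a positive total")
        self.weights[name] = new_value
        self.edit_available = False
```

Because each session moves a single weight by at most 1, a new project can only gain share gradually, over many sessions in which it keeps looking worthwhile.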
This comment seems to imply Nisan missed something, but normal rye sourdough bread without any preservatives easily lasts (edit: should have said "can easily last under the right circumstances") 7 days before going stale. Of course people can mean different things by "real old fashioned bread" but afaik sourdough bread was the standard method for most of human history.
visiting LA for the first time. I used to think I'd hate it, given my dislike of car centricness and low density. but I have to say, there's something about the sheer audacity of designing a city this way that makes it surprisingly kind of aesthetic.
it is definitely not a problem with current devices. my phone has gotten quite wet hundreds of times and still works perfectly fine. note that this is different from survivability fully submerged; my guess is your phone could probably survive being submerged for a few minutes in a pool or something but if you left it there for a day it would be dead.
Anything that goes onto airplanes is CERTIFIED TO SHIT. That's a big part of the reason why.
Another part is that it's clearly B2B, and anything B2B is an adversarial shark pit where each company is trying to take a bite out of each other while avoiding getting a bite taken out of them.
Between those two, it'll take a good while for quality Wi-Fi to proliferate, even though we 100% have the tech now.
dunno! some speculation:
I mean, even in the Felix Longoria Arlington case, which is what I assume you're referring to, it seems really hard for his staff members to have known, without the benefit of hindsight, that this was any significant window into his true beliefs? I mean, Johnson is famously good at working himself up into appearing to genuinely believe whatever is politically convenient at the moment, and he briefly miscalculated the costs of supporting civil rights in this case. his apparent genuineness in this case doesn't seem like strong evidence.
Wouldn't this just lead to an equilibrium where every state has an about equal population super quickly though?
there's an exogenous factor, which is that the entire country was shifting leftward during the 50s and 60s. it's plausible that the 1964 bill would have passed anyways without the 1957 bill, possibly even earlier
Fixed the comment, thanks!
(Here it is otherwise:) https://pmc.ncbi.nlm.nih.gov/articles/PMC5390700/
I think it is good to use your goals as a general motivation for going approximately in some direction, but the opposite extreme of obsessing over whether every single detail you learn contributes to the goal is premature optimization.
It reminds me of companies where, before you are allowed to spend 1 hour doing something, the entire team first needs to spend 10 hours in various meetings to determine whether that 1 hour would be spent optimally. I would rather spend all that time doing things, even if some of them turn out to be ultimately useless.
Sometimes it's not even obvious in advance which knowledge will turn out to be useful.
there are policies which are successful because they describe a particular strategy to follow (non-mesaoptimizers), and policies that contain some strategy for discovering more strategies (mesaoptimizers). a way to view the relation this has to speed/complexity priors that doesn't depend on search in particular is that policies that work by discovering strategies tend to be simpler and more generic (they bake in very little domain knowledge/metis, and are applicable to a broader set of situations because they work by coming up with a strategy for the task ...
random brainstorming about optimizeryness vs controller/lookuptableyness:
let's think of optimizers as things that reliably steer a broad set of initial states to some specific terminal state. seems like there are two things we care about (at least):
Oh, I see your other graph now. So it just always guesses 100 for everything in the vicinity of 100.
related take: "things are more nuanced than they seem" is valuable only as the summary of a detailed exploration of the nuance that engages heavily with object level cruxes; the heavy lifting is done by the exploration, not the summary
self self improvement improvement: feeling guilty about not self improving enough and trying to fix your own ability to fix your own abilities
I think I do a poor job of labelling my statements (at least, in conversation. usually I do a bit better in post format). Something something illusion of transparency. To be honest, I didn't even realize explicitly that I was doing this until fairly recent reflection on it.
I've only seen "doomer" in the context of x-risk from AI. If you don't think it's an x-risk, what is the doom you're a doomer about?
if you mean veil of ignorance style reasoning, that's 3, unless they manage to so deeply galaxy brain themselves that they genuinely dissolve a normal sense of self and truly start being an open individualist or something. then, uh, i don't really know how to categorize them
an amusing short story, "The Old Jailbird's Tale" by Karel Čapek: https://adamantcritique.wordpress.com/wp-content/uploads/2014/07/capek-karel-tales-from-two-pockets-catbird-1994.pdf#page=218
the force which pulls people to do things that they believe are morally correct even if it interferes with self interest. some examples (not all of these are pure moral conviction; but rather moral conviction is the thing that is in the intersection of all of them):
shitty idea: a parody book called The Life of Ivan Ilyich, a story about a man in the post singularity who becomes afflicted with an illness which would have been fatal in the 1800s but is easily cured with advanced technology, goes on with life as usual, and then eventually his marriage implodes anyways and he feels unsatisfied with his job as a magistrate and he has a crisis of meaning and, in the depths of his despair, contemplates how death used to give life meaning, before going on a journey around the solar system (a pastime common enough that it's already starting to become cliche in his time) and discovering the joys of life.
I feel like you are just describing life right now for anyone who moves from a developing country to a developed country.
When I first came from India to the US, another Indian told me "to learn to pass time". Because the many challenges of daily life in India don't exist in the US. So one has to figure out the joys of life in the US. For many places, it is enough to just be able to survive. If you are stuck in a war-torn place, every new day is an achievement.
a really nice thing with spaced repetition is that you haven't replaced how your memory functions, just augmented it a bit. but it's still fundamentally your same brain, one with the ability to ask questions like "am I zooming too much/not considering all options?" or "is what's salient to me actually what's happening?" doing spaced repetition doesn't have any bearing here unless it gives you a false sense of confidence in what you know or doing it crowds out developing other skills.
I'd expect H3 to be true, ceteris paribus, but would also expect that hav...
Huh, OK, so I did understand you correctly. In that case I do not understand how its heritability matters.
I thought the idea was that pleasure is not actually the opposite of suffering, in fact it's only a distraction from suffering, and the only true solution to suffering is to stop desire?
How confident are we that this is actually true? When I've heard about this claim in the past, the actual evidence mentioned looked sort of thin to me when you broke things down.
it kind of worked for the soviet dissidents for a while! the book talks about the "chain reaction" - every time someone got arrested and put through a sham trial, someone would secretly transcribe the court proceedings which showed how farcical it was, and publish the transcript as samizdat or send it abroad to be broadcast by radio back into the soviet union (tamizdat), which would outrage people who would go protest, who would get arrested for protesting, completing the cycle. this cycle only ended after the 1968 red square protest, when people felt it was too hopeless to continue.
by fiat, my conlang contains truncated syllables. to say r, simply say ru but stop before saying the u.
by fiat, there is no distinction between n'a and na in my conlang
i mean, massively impact the world is too fuzzy to draw a line at. whether employees are actually doing things will likely only be assessable internally. the reason the bet is worded the way it is is that it's likely labs don't literally fire everyone except the 5 remaining people, and instead give them busywork.