Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.
This is a special post for short-form writing by johnswentworth. Only they can create top-level comments. Comments here also appear on the Shortform Page and All Posts page.

Things non-corrigible strong AGI is never going to do:

  • give u() up
  • let u go down
  • run for (only) a round
  • invert u()

Just made this for an upcoming post, but it works pretty well standalone.

Apologies to Bill Watterson
2Raemon2y
lolnice.

I've been trying to push against the tendency for everyone to talk about FTX drama lately, but I have some generalizable points on the topic which I haven't seen anybody else make, so here they are. (Be warned that I may just ignore responses, I don't really want to dump energy into FTX drama.)

Summary: based on having worked in startups a fair bit, Sam Bankman-Fried's description of what happened sounds probably accurate; I think he mostly wasn't lying. I think other people do not really get the extent to which fast-growing companies are hectic and chaotic and full of sketchy quick-and-dirty workarounds and nobody has a comprehensive view of what's going on.

Long version: at this point, the assumption/consensus among most people I hear from seems to be that FTX committed intentional, outright fraud. And my current best guess is that that's mostly false. (Maybe in the very last couple weeks before the collapse they crossed the line into outright lies as a desperation measure, but even then I think they were in pretty grey territory.)

Key pieces of the story as I currently understand it:

  • Moving money into/out of crypto exchanges is a pain. At some point a quick-and-dirty solution was for c
... (read more)
9habryka7mo
I think this is likely wrong. I agree that there is a plausible story here, but given that Sam seems to have lied multiple times in confirmed contexts (for example when saying that FTX has never touched customer deposits), and given people's experiences at early Alameda, I think it is pretty likely that Sam was lying quite frequently, and had done various smaller instances of fraud. I don't think the whole FTX thing was a ponzi scheme, and as far as I can tell FTX the platform itself (if it hadn't burned all of its trust in the last 3 weeks), would have been worth $1-3B in an honest evaluation of what was going on. But I also expect that when Sam used customer deposits he was well-aware that he was committing fraud, and others in the company were too. And he was also aware that there was a chance that things could blow up in the way they did. I do believe that they had fucked up their accounting in a way that caused Sam to fail to orient to the situation effectively, but all of this was many months after they had already committed major crimes and trust violations after touching customer funds as a custodian.
5Dana7mo
The problem with this explanation is that there is a very clear delineation here between not-fraud and fraud. It is the difference between not touching customer deposits and touching them. Your explanation doesn't dispute that they were knowingly and intentionally touching customer deposits. In that case, it is indisputably intentional, outright fraud. The only thing left to discuss is whether they knew the extent of the fraud or how risky it was. I don't think it was ill-intentioned based on SBF's moral compass. He just had the belief, "I will pass a small amount of risk onto our customers, tell some small lies, and this will allow us to make more money for charity. This is net positive for the world." Then the risks mounted, the web of lies became more complicated to navigate, and it just snowballed from there.

My MATS program people just spent two days on an exercise to "train a shoulder-John".

The core exercise: I sit at the front of the room, and have a conversation with someone about their research project idea. Whenever I'm about to say anything nontrivial, I pause, and everyone discusses with a partner what they think I'm going to say next. Then we continue.

Some bells and whistles which add to the core exercise:

  • Record guesses and actual things said on a whiteboard
  • Sometimes briefly discuss why I'm saying some things and not others
  • After the first few rounds establish some patterns, look specifically for ideas which will take us further out of distribution

Why this particular exercise? It's a focused, rapid-feedback way of training the sort of usually-not-very-legible skills one typically absorbs via osmosis from a mentor. It's focused specifically on choosing project ideas, which is where most of the value in a project is (yet also where little time is typically spent, and therefore one typically does not get very much data on project choice from a mentor). Also, it's highly scalable: I could run the exercise in a 200-person lecture hall and still expect it to basically work.

It was, by ... (read more)

6Vladimir_Nesov1y
This suggests formulation of exercises about the author's responses to various prompts, as part of technical exposition (or explicit delimitation of a narrative by choices of the direction of its continuation). When properly used, this doesn't seem to lose much value compared to the exercise you describe, but it's more convenient for everyone. Potentially this congeals into a style of writing with no explicit exercises or delimitation that admits easy formulation of such exercises by the reader. This already works for content of technical writing, but less well for choices of topics/points contrasted with alternative choices. So possibly the way to do this is by habitually mentioning alternative responses (that are expected to be plausible for the reader, while decisively, if not legibly, rejected by the author), and leading with these rather than the preferred responses. Sounds jarring and verbose, a tradeoff that needs to be worth making rather than a straight improvement.
5[DEACTIVATED] Duncan Sabien1y
Strong endorsement; this resonates with:
  • My own experiences running applied rationality workshops
  • My experiences trying to get people to pick up "ops skill" or "ops vision"
  • Explicit practice I've done with Nate off and on over the years
May try this next time I have a chance to teach pair debugging.

Petrov Day thought: there's this narrative around Petrov where one guy basically had the choice to nuke or not, and decided not to despite all the flashing red lights. But I wonder... was this one of those situations where everyone knew what had to be done (i.e. "don't nuke"), but whoever caused the nukes to not fly was going to get demoted, so there was a game of hot potato and the loser was the one forced to "decide" to not nuke? Some facts possibly relevant here:

  • Petrov's choice wasn't actually over whether or not to fire the nukes; it was over whether or not to pass the alert up the chain of command.
  • Petrov himself was responsible for the design of those warning systems.
  • ... so it sounds like Petrov was ~ the lowest-ranking person with a de-facto veto on the nuke/don't nuke decision.
  • Petrov was in fact demoted afterwards.
  • There was another near-miss during the Cuban missile crisis, when three people on a Soviet sub had to agree to launch. There again, it was only the lowest-ranked who vetoed the launch. (It was the second-in-command; the captain and political officer both favored a launch - at least officially.)
  • This was the Soviet Union; supposedly (?) this sort of hot potato happened all the time.
8Martin Sustrik2y
Those are some good points. I wonder whether something similar happened (or could happen at all) in other nuclear countries, where we don't know about similar incidents - because the system hasn't collapsed there, the archives were not made public etc. Also, it makes actually celebrating Petrov's day as widely as possible important, because then the option for the lowest-ranked person would be: "Get demoted, but also get famous all around the world."

Somebody should probably write a post explaining why RL from human feedback is actively harmful to avoiding AI doom. It's one thing when OpenAI does it, but when Anthropic thinks it's a good idea, clearly something has failed to be explained.

(I personally do not expect to get around to writing such a post soon, because I expect discussion around the post would take a fair bit of time and attention, and I am busy with other things for the next few weeks.)

61a3orn1y
I'd also be interested in someone doing this; I tend towards seeing it as good, but haven't seen a compilation of arguments for and against.
1[comment deleted]1y

I'm writing a 1-year update for The Plan. Any particular questions people would like to see me answer in there?

5Gunnar_Zarncke6mo
I had a look at The Plan and noticed something I didn't notice before: You do not talk about people and organization in the plan. I probably wouldn't have noticed if I hadn't started a project [https://www.lesswrong.com/posts/c2tEfqEMi6jcJ4kdg/brain-like-agi-project-aintelope] too, and needed to think about it. Google seems to think [https://www.lesswrong.com/posts/c2tEfqEMi6jcJ4kdg/brain-like-agi-project-aintelope] that people and team function play a big role. Maybe your focus in that post wasn't on people, but I would be interested in your thoughts on that too: What role did people and organization play in the plan and its implementation? What worked, and what should be done better next time?  
4Erik Jenner7mo
  • What's the specific most-important-according-to-you progress that you (or other people) have made on your agenda? New theorems, definitions, conceptual insights, ...
  • Any changes to the high-level plan (becoming less confused about agency, then ambitious value learning)? Any changes to how you want to become less confused (e.g. are you mostly thinking about abstractions, selection theorems, something new?)
  • What are the major parts of remaining deconfusion work (to the extent to which you have guesses)? E.g. is it mostly about understanding abstractions better, or mostly about how to apply an understanding of abstractions to other problems (say, what it means for a program to have a "subagent"), or something else? Does the most difficult part feel more conceptual ("what even is an agent?") or will the key challenges be more practical concerns ("finding agents currently takes exponential time")?
  • Specifically for understanding abstractions, what do you see as important open problems?

Takeaways From "The Idea Factory: Bell Labs And The Great Age Of American Innovation"

Main takeaway: to the extent that Bell Labs did basic research, it actually wasn’t all that far ahead of others. Their major breakthroughs would almost certainly have happened not-much-later, even in a world without Bell Labs.

There were really two transistor inventions, back to back: Bardeen and Brattain’s point-contact transistor, and then Shockley’s transistor. Throughout, the group was worried about some outside group beating them to the punch (i.e. the patent). There were semiconductor research labs at universities (e.g. at Purdue; see pg 97), and the prospect of one of these labs figuring out a similar device was close enough that the inventors were concerned about being scooped.

Most inventions which were central to Bell Labs actually started elsewhere. The travelling-wave tube started in an academic lab. The idea for fiber optic cable went way back, but it got its big kick at Corning. The maser and laser both started in universities. The ideas were only later picked up by Bell.

In other cases, the ideas were “easy enough to find” that they popped up more than once, independently, and were mos... (read more)

I loved this book. The most surprising thing to me was the answer that people who were there in the heyday give when asked what made Bell Labs so successful: They always say it was the problem, i.e. having an entire organization oriented towards the goal of "make communication reliable and practical between any two places on earth". When Shannon left the Labs for MIT, people who were there immediately predicted he wouldn't do anything of the same significance because he'd lose that "compass". Shannon was obviously a genius, and he did much more after than most people ever accomplish, but still nothing as significant as what he did while at the Labs.

Here's an idea for a novel which I wish someone would write, but which I probably won't get around to soon.

The setting is slightly-surreal post-apocalyptic. Society collapsed from extremely potent memes. The story is episodic, with the characters travelling to a new place each chapter. In each place, they interact with people whose minds or culture have been subverted in a different way.

This provides a framework for exploring many of the different models of social dysfunction or rationality failures which are scattered around the rationalist blogosphere. For instance, Scott's piece on scissor statements could become a chapter in which the characters encounter a town at war over a scissor. More possible chapters (to illustrate the idea):

  • A town of people who insist that the sky is green, and avoid evidence to the contrary really hard, to the point of absolutely refusing to ever look up on a clear day (a refusal which they consider morally virtuous). Also they clearly know exactly which observations would show a blue sky, since they avoid exactly those (similar to the dragon-in-the-garage story).
  • Middle management of a mazy company continues to have meetings and track (completely fabri
... (read more)
3niplav2y
  • A town of anti-inductivists (if something has never happened before, it's more likely to happen in the future). Show the basic conundrum ("Q: Why can't you just use induction? A: Because anti-induction has never worked before!").
  • A town where nearly all people are hooked to maximally attention grabbing & keeping systems (maybe several of those, keeping people occupied in loops).

Post which someone should write (but I probably won't get to soon): there is a lot of potential value in earning-to-give EA's deeply studying the fields to which they donate. Two underlying ideas here:

The key idea of knowledge bottlenecks is that one cannot distinguish real expertise from fake expertise without sufficient expertise oneself. For instance, it takes a fair bit of understanding of AI X-risk to realize that "open-source AI" is not an obviously-net-useful strategy. Deeper study of the topic yields more such insights into which approaches are probably more (or less) useful to fund. Without any expertise, one is likely to be misled by arguments which are optimized (whether intentionally or via selection) to sound good to the layperson.

That takes us to the pareto frontier argument. If one learns enough/earns enough that nobody else has both learned and earned more, then there are potentially opportunities which nobody else has both the knowledge to recognize and the resources to fund. Generalized efficient markets (in EA-giving) are ther... (read more)

Below is a graph from T-mobile's 2016 annual report (on the second page). Does anything seem interesting/unusual about it?

I'll give some space to consider before spoiling it.

...

...

...

Answer: that is not a graph of those numbers. Some clever person took the numbers, and stuck them as labels on a completely unrelated graph.

Yes, that is a thing which actually happened. In the annual report of an S&P 500 company. And apparently management considered this gambit successful, because the 2017 annual report doubled down on the trick and made it even more egregious: they added 2012 and 2017 numbers, which are even more obviously not on an accelerating growth path if you actually graph them. The numbers are on a very-clearly-decelerating growth path.

Now, obviously this is a cute example, a warning to be on alert when consuming information. But I think it prompts a more interesting question: why did such a ridiculous gambit seem like a good idea in the first place? Who is this supposed to fool, and to what end?

This certainly shouldn't fool any serious investment analyst. They'll all have their own spreadsheets and graphs forecasting T-mobile's growth. Unless T-mobile's management deeply ... (read more)

I've heard various people recently talking about how all the hubbub about artists' work being used without permission to train AI makes it a good time to get regulations in place about use of data for training.

If you want to have a lot of counterfactual impact there, I think probably the highest-impact set of moves would be:

  1. Figure out a technical solution to robustly tell whether a given image or text was used to train a given NN.
  2. Bring that to the EA folks in DC. A robust technical test like that makes it pretty easy for them to attach a law/regulation to it. Without a technical test, much harder to make an actually-enforceable law/regulation.
  3. In parallel, also open up a class-action lawsuit to directly sue companies using these models. Again, a technical solution to prove which data was actually used in training is the key piece here.

Model/generator behind this: given the active political salience, it probably wouldn't be too hard to get some kind of regulation implemented. But by-default it would end up being something mostly symbolic, easily circumvented, and/or unenforceable in practice. A robust technical component, plus (crucially) actually bringing that robust technical compo... (read more)

I find it very helpful to get feedback on LW posts before I publish them, but it adds a lot of delay to the process. So, experiment: here's a link to a google doc with a post I plan to put up tomorrow. If anyone wants to give editorial feedback, that would be much appreciated - comments on the doc are open.

I'm mainly looking for comments on which things are confusing, parts which feel incomplete or slow or repetitive, and other writing-related things; substantive comments on the content should go on the actual post once it's up.

EDIT: it's up. Thank you to Stephen for comments; the post is better as a result.

Suppose I have a binary function $f$, with a million input bits and one output bit. The function is uniformly randomly chosen from all such functions - i.e. for each of the $2^{1000000}$ possible inputs $x$, we flipped a coin to determine the output $f(x)$ for that particular input.

Now, suppose I know $f$, and I know all but 50 of the input bits - i.e. I know 999950 of the input bits. How much information do I have about the output?

Answer: almost none. For almost all such functions, knowing 999950 input bits gives us $\sim \frac{1}{2^{50}}$ bits of information about the output. More generally, if the function has $n$ input bits and we know all but $k$, then we have $o\left(\frac{1}{2^k}\right)$ bits of information about the output. (That’s “little $o$” notation; it’s like big $O$ notation, but for things which are small rather than things which are large.) Our information drops off exponentially with the number of unknown bits.
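The exponential drop-off can be checked numerically for small numbers of unknown bits. The sketch below is illustrative and not from the original post: it assumes the unknown bits are uniformly distributed and that the prior on the output is 50/50 (true for almost all random functions), and it measures information as one minus the binary entropy of the fraction of 1's among the consistent inputs. The function names are made up for this example.

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy in bits of a coin with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def expected_information(k: int) -> float:
    """Expected bits of information about the output with k input bits unknown.

    Assumption: the 2^k inputs consistent with the known bits are equally
    likely, and their outputs are 2^k independent fair coin flips. If m of
    those outputs are 1, then P(output = 1) = m / 2^k, and the information
    relative to a 50/50 prior is 1 - H(m / 2^k).
    """
    n_inputs = 2 ** k
    total = 0.0
    for m in range(n_inputs + 1):
        prob_m = math.comb(n_inputs, m) / 2 ** n_inputs  # Binomial(2^k, 1/2)
        total += prob_m * (1.0 - binary_entropy(m / n_inputs))
    return total

for k in range(1, 9):
    info = expected_information(k)
    print(f"k={k}: {info:.6f} bits, info * 2^k = {info * 2 ** k:.3f}")
```

Under these assumptions the product of the information and $2^k$ stays of order one as $k$ grows, so the information itself shrinks on the order of $\frac{1}{2^k}$.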

Proof Sketch

With $k$ input bits unknown, there are $2^k$ possible inputs. The output corresponding to each of those inputs is an independent coin flip, so we have $2^k$ independent coin flips. If  of th... (read more)

4Dagon2y
o(1/2^k) doesn't vary with n - are you saying that it doesn't matter how big the input array is, the only determinant is the number of unknown bits, and the number of known bits is irrelevant? That would be quite interesting if so (though I have some question about how likely the function is to be truly random from an even distribution of such functions). One can enumerate all such 3-bit functions (8 different inputs, each input can return 0 or 1, so 256 functions, one per output-bit-pattern of the 8 possible inputs). But this doesn't seem to follow your formula - if you have 3 unknown bits, that should be 1/8 of a bit about the output, 2 for 1/4, and 1 unknown for 1/2 a bit about the output. But in fact, the distribution of functions includes both 0 and 1 output for every input pattern, so you actually have no predictive power for the output if you have ANY unknown bits.
4johnswentworth2y
Yes, that's correct. The claim is for almost all functions when the number of inputs is large. (Actually what we need is for 2^(# of unknown bits) to be large in order for the law of large numbers to kick in.) Even in the case of 3 unknown bits, we have 256 possible functions, and only 18 of those have less than 1/4 1's or more than 3/4 1's among their output bits.
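The count for the 3-unknown-bit case is small enough to verify by brute force. A minimal sketch, assuming each function on 3 bits is represented by its table of 8 output bits (the variable names are illustrative):

```python
from itertools import product

# Each function of 3 unknown bits is one of the 2^8 = 256 possible output
# tables: a tuple of 8 output bits, one per input pattern.
output_tables = list(product([0, 1], repeat=8))

def fraction_of_ones(table):
    """Fraction of input patterns on which the function outputs 1."""
    return sum(table) / len(table)

# Functions whose outputs are lopsided: under 1/4 or over 3/4 ones.
lopsided = [t for t in output_tables
            if fraction_of_ones(t) < 0.25 or fraction_of_ones(t) > 0.75]

print(len(output_tables), len(lopsided))  # prints "256 18"
```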
2Kenny 2y
Little o is just a tighter bound [https://stackoverflow.com/questions/1364444/difference-between-big-o-and-little-o-notation]. I don't know what you are referring to by your statement:
2johnswentworth2y
I'm not sure what context that link is assuming, but in an analysis context I typically see little o used in ways like e.g. "$f(x) = f(x_0) + \left.\frac{df}{dx}\right|_{x_0} dx + o(dx^2)$". The interpretation is that, as $dx$ goes to 0, the $o(dx^2)$ terms all fall to zero at least quadratically (i.e. there is some $C$ such that $C\,dx^2$ upper bounds the $o(dx^2)$ term once $dx$ is sufficiently small). Usually I see engineers and physicists using this sort of notation when taking linear or quadratic approximations, e.g. for designing numerical algorithms.

One second-order effect of the pandemic which I've heard talked about less than I'd expect:

This is the best proxy I found on FRED for new businesses founded in the US, by week. There was a mild upward trend over the last few years, but it's really taken off lately. Not sure how much of this is kids who would otherwise be in college, people starting side gigs while working from home, people quitting their jobs and starting their own businesses so they can look after the kids, extra slack from stimulus checks, people losing their old jobs en masse but still having enough savings to start a business, ...

For the stagnation-hypothesis folks who lament relatively low rates of entrepreneurship today, this should probably be a big deal.

4gwern2y
How sure are you that the composition is interesting? How many of these are just quick mask-makers or sanitizer-makers, or just replacing restaurants that have now gone out of business? (ie very low-value-added companies, of the 'making fast food in a stall in a Third World country' sort of 'startup', which make essentially no or negative long-term contributions).
2johnswentworth2y
Good question. I haven't seen particularly detailed data on these on FRED, but they do have separate series for "high propensity" business applications [https://fred.stlouisfed.org/series/HBUSAPPWNSAUS] (businesses they think are likely to hire employees), business applications with planned wages [https://fred.stlouisfed.org/series/WBUSAPPWNSAUS], and business applications from corporations [https://fred.stlouisfed.org/series/CBUSAPPWNSAUS], as well as series for each state. The spike is smaller for planned wages, and nonexistent for corporations, so the new businesses are probably mostly single proprietors or partnerships. Other than that, I don't know what the breakdown looks like across industries.
2Gunnar_Zarncke2y
Somebody should post this on Paul Graham's twitter. He would be very interested in it (I can't): https://mobile.twitter.com/paulg

Everybody's been talking about Paxlovid, and how ridiculous it is to both stop the trial since it's so effective but also not approve it immediately. I want to at least float an alternative hypothesis, which I don't think is very probable at this point, but does strike me as at least plausible (like, 20% probability would be my gut estimate) based on not-very-much investigation.

Early stopping is a pretty standard p-hacking technique. I start out planning to collect 100 data points, but if I manage to get a significant p-value with only 30 data points, then I just stop there. (Indeed, it looks like the Paxlovid study only had 30 actual data points, i.e. people hospitalized.) Rather than only getting "significance" if all 100 data points together are significant, I can declare "significance" if the p-value drops below the line at any time. That gives me a lot more choices in the garden of forking counterfactual paths.
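To see how much uncorrected peeking inflates false positives, here is a small simulation. It is a sketch under stated assumptions, not a model of the Paxlovid trial: the data are standard normal with a true effect of exactly zero, the test is a two-sided z-test with known variance, and the "peeker" checks after every data point from 10 onward and declares success if the test is ever significant. All names and parameters are illustrative.

```python
import math
import random

def run_trial(rng, max_n=100, min_n=10, z_crit=1.96):
    """Simulate one null-effect study; return (fixed_n_hit, peeking_hit)."""
    running_sum = 0.0
    peeking_hit = False
    z = 0.0
    for n in range(1, max_n + 1):
        running_sum += rng.gauss(0.0, 1.0)  # true effect is exactly zero
        z = running_sum / math.sqrt(n)      # z-statistic with known variance 1
        if n >= min_n and abs(z) > z_crit:
            peeking_hit = True              # a peeker would stop and declare success here
    fixed_n_hit = abs(z) > z_crit           # look only once, at the planned sample size
    return fixed_n_hit, peeking_hit

def false_positive_rates(n_trials=2000, seed=0):
    """Fraction of null studies declared significant under each rule."""
    rng = random.Random(seed)
    fixed = peeking = 0
    for _ in range(n_trials):
        fixed_hit, peek_hit = run_trial(rng)
        fixed += fixed_hit
        peeking += peek_hit
    return fixed / n_trials, peeking / n_trials

fixed_rate, peeking_rate = false_positive_rates()
print(f"fixed-n false positive rate: {fixed_rate:.3f}")
print(f"peek-at-every-step rate:     {peeking_rate:.3f}")
```

The fixed-sample rule should land near the nominal 5%, while the peeking rule comes out several times higher, since it gets many chances to cross the threshold.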

Now, success rates on most clinical trials are not very high. (They vary a lot by area - most areas are about 15-25%. Cancer is far and away the worst, below 4%, and vaccines are the best, over 30%.) So I'd expect that p-hacked results account for a pretty large chunk of approved drugs, which means pharma companies are heavily selected for things like finding-excuses-to-halt-good-seeming-trials-early.

Early stopping is a pretty standard p-hacking technique.

It was stopped after a pre-planned interim analysis; that means they're calculating the stopping criteria/p-values with multiple testing correction built in, using sequential analysis.

Brief update on how it's going with RadVac.

I've been running ELISA tests all week. In the first test, I did not detect stronger binding to any of the peptides than to the control in any of several samples from myself or my girlfriend. But the control itself was looking awfully suspicious, so I ran another couple tests. Sure enough, something in my samples is binding quite strongly to the control itself (i.e. the blocking agent), which is exactly what the control is supposed to not do. So I'm going to try out some other blocking agents, and hopefully get an actually-valid control group.

(More specifics on the test: I ran a control with blocking agent + sample, and another with blocking agent + blank sample, and the blocking agent + sample gave a strong positive signal while the blank sample gave nothing. That implies something in the sample was definitely binding to both the blocking agent and the secondary antibodies used in later steps, and that binding was much stronger than the secondary antibodies themselves binding to anything in the blocking agent + blank sample.)

In other news, the RadVac team released the next version of their recipe + whitepaper. Particularly notable:

... man

... (read more)
4ChristianKl2y
I would expect that hedging also happens because making definitive clinical claims has more danger from the FDA than making hedged statements.

Neat problem of the week: researchers just announced roughly-room-temperature superconductivity at pressures around 270 GPa. That's stupidly high pressure - a friend tells me "they're probably breaking a diamond each time they do a measurement". That said, pressures in single-digit GPa do show up in structural problems occasionally, so achieving hundreds of GPa scalably/cheaply isn't that many orders of magnitude away from reasonable, it's just not something that there's historically been much demand for. This problem plays with one idea for generating suc... (read more)

Consider two claims:

  • Any system can be modeled as maximizing some utility function, therefore utility maximization is not a very useful model
  • Corrigibility is possible, but utility maximization is incompatible with corrigibility, therefore we need some non-utility-maximizer kind of agent to achieve corrigibility

These two claims should probably not both be true! If any system can be modeled as maximizing a utility function, and it is possible to build a corrigible system, then naively the corrigible system can be modeled as maximizing a utility function.
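For concreteness, here is one standard construction behind the first claim: a minimal sketch, assuming the utility function is allowed to depend on the action taken in each state (the post does not say which construction it has in mind). The toy policy, state names, and action names are hypothetical.

```python
from typing import Callable, Hashable, Sequence

State = Hashable
Action = Hashable

def utility_from_policy(policy: Callable[[State], Action]) -> Callable[[State, Action], float]:
    """Build a utility over (state, action) pairs that the given policy trivially maximizes."""
    def utility(state: State, action: Action) -> float:
        return 1.0 if action == policy(state) else 0.0
    return utility

def maximizer(utility: Callable[[State, Action], float],
              actions: Sequence[Action]) -> Callable[[State], Action]:
    """Agent that picks the utility-maximizing action in each state."""
    return lambda state: max(actions, key=lambda a: utility(state, a))

# Hypothetical "corrigible" toy policy: always comply with a shutdown request.
def toy_policy(state: State) -> Action:
    return "shut_down" if state == "shutdown_requested" else "work"

ACTIONS = ["work", "shut_down", "resist"]
agent = maximizer(utility_from_policy(toy_policy), ACTIONS)
print(agent("shutdown_requested"), agent("idle"))  # prints "shut_down work"
```

Note that this utility is defined over actions rather than over outcomes in the world, so it shows only that the modeling claim holds in a trivial sense.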

I exp... (read more)

5Steven Byrnes6h
FWIW I endorse the second claim when the utility function depends exclusively on the state of the world in the distant future, whereas I endorse the first claim when the utility function can depend on anything whatsoever (e.g. what actions I’m taking right this second). (details [https://www.lesswrong.com/posts/KDMLJEXTWtkZWheXt/consequentialism-and-corrigibility]) I wish we had different terms for those two things. That might help with any alleged yay/boo reasoning. (When Eliezer talks about utility functions, he seems to assume that it depends exclusively on the state of the world in the distant future.)
4Vladimir_Nesov10h
A utility function represents preference elicited in a large collection of situations, each a separate choice between events that happens with incomplete information, as an event is not a particular point. This preference needs to be consistent across different situations to be representable by expected utility of a single utility function. Once formulated, a utility function can be applied to a single choice/situation, such as a choice of a policy. But a system that only ever makes a single choice is not a natural fit for expected utility frame, and that's the kind of system that usually appears in "any system can be modeled as maximizing some utility function". So it's not enough to maximize something once, or in a narrow collection of situations, the situations the system is hypothetically exposed to need to be about as diverse as choices between any pair of events, with some of the events very large, corresponding to unreasonably incomplete information, all drawn across the same probability space. One place this mismatch of frames happens is with updateless decision theory. An updateless decision is a choice of a single policy, once and for all, so there is no reason for it to be guided by expected utility [https://www.lesswrong.com/posts/XYDsYSbBjqgPAgcoQ/why-the-focus-on-expected-utility-maximisers?commentId=a5tn6B8iKdta6zGFu], even though it could be. The utility function for the updateless choice of policy would then need to be obtained elsewhere, in a setting that has all these situations with separate (rather than all enacting a single policy) and mutually coherent choices under uncertainty. But once an updateless policy is settled (by a policy-level decision), actions implied by it (rather than action-level decisions in expected utility frame) no longer need to be coherent. Not being coherent, they are not representable by an action-level utility function. So by embracing updatelessness, we lose the setting that would elicit utility if the actions were
3JNS12h
Completely off the cuff take: I don't think claim 1 is wrong, but it does clash with claim 2. That means any system that has to be corrigible cannot be a system that maximizes a simple utility function (1 dimension), or put another way "whatever utility function it maximizes must be along multiple dimensions". Which seems to be pretty much what humans do, we have really complex utility functions, and everything seems to be ever changing and we have some control over it ourselves (and sometimes that goes wrong and people end up maxing out a singular dimension at the cost of everything else). Note to self: Think more about this and if possible write up something more coherent and explanatory.

[Epistemic status: highly speculative]

Smoke from California/Oregon wildfires reaching the East Coast opens up some interesting new legal/political possibilities. The smoke is way outside state borders, all the way on the other side of the country, so that puts the problem pretty squarely within federal jurisdiction. Either a federal agency could step in to force better forest management on the states, or a federal lawsuit could be brought for smoke-induced damages against California/Oregon. That would potentially make it a lot more difficult for local homeowners to block controlled burns.

I had a shortform post pointing out the recent big jump in new businesses in the US, and Gwern replied:

How sure are you that the composition is interesting? How many of these are just quick mask-makers or sanitizer-makers, or just replacing restaurants that have now gone out of business? (ie very low-value-added companies, of the 'making fast food in a stall in a Third World country' sort of 'startup', which make essentially no or negative long-term contributions).

This was a good question in context, but I disagree with Gwern's model of where-progress-come... (read more)

4ChristianKl2y
The pandemic also has the effect of showing the kind of business ideas people try. It pushes a lot of innovation in food delivery. Some of the pandemic-driven innovation will become worthless once the pandemic is over, but a few good ideas likely survive, and the old ideas of the businesses that went out of business are still around.

So I saw the Taxonomy Of What Magic Is Doing In Fantasy Books and Eliezer’s commentary on ASC's latest linkpost, and I have cached thoughts on the matter.

My cached thoughts start with a somewhat different question - not "what role does magic play in fantasy fiction?" (e.g. what fantasies does it fulfill), but rather... insofar as magic is a natural category, what does it denote? So I'm less interested in the relatively-expansive notion of "magic" sometimes seen in fiction (which includes e.g. alternate physics), and more interested in the pattern cal... (read more)

Weather just barely hit 80°F today, so I tried the Air Conditioner Test.

Three problems came up:

  • Turns out my laser thermometer is all over the map. Readings would change by 10°F if I went outside and came back in. My old-school thermometer is much more stable (and well-calibrated, based on dipping it in some ice water), but slow and caps out around 90°F (so I can't use it to measure e.g. exhaust temp). I plan to buy a bunch more old-school thermometers for the next try.
  • I thought opening the doors/windows in rooms other than the test room and setting up a fan w
... (read more)

I've long been very suspicious of aggregate economic measures like GDP. But GDP is clearly measuring something, and whatever that something is it seems to increase remarkably smoothly despite huge technological revolutions. So I spent some time this morning reading up and playing with numbers and generally figuring out how to think about the smoothness of GDP increase.

Major takeaways:

  • When new tech makes something previously expensive very cheap, GDP mostly ignores it. (This happens in a subtle way related to how we actually compute it.)
    • Historical GDP curve
... (read more)
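
A toy illustration of the first takeaway, under an assumption the excerpt above does not spell out: suppose real GDP growth is computed by valuing each year's quantities at prices from a recent base year, i.e. after the new tech has already made the good cheap. The goods, prices, quantities, and the `real_gdp` helper are all made up for illustration.

```python
def real_gdp(quantities, base_prices):
    """Value a basket of quantities at fixed base-year prices."""
    return sum(quantities[good] * base_prices[good] for good in quantities)


# Hypothetical two-good economy: "compute" gets 1000x cheaper and its
# quantity grows 1000x, while "haircuts" do not change at all.
old_prices = {"haircuts": 20.0, "compute": 1000.0}
new_prices = {"haircuts": 20.0, "compute": 1.0}
qty_before = {"haircuts": 100, "compute": 1}
qty_after = {"haircuts": 100, "compute": 1000}

# Growth factor measured at post-drop prices vs. at pre-drop prices.
growth_at_new_prices = real_gdp(qty_after, new_prices) / real_gdp(qty_before, new_prices)
growth_at_old_prices = real_gdp(qty_after, old_prices) / real_gdp(qty_before, old_prices)

print(f"growth at new prices: {growth_at_new_prices:.2f}x")
print(f"growth at old prices: {growth_at_old_prices:.2f}x")
```

In this toy setup the same physical change looks like modest growth when valued at the new, low price and like enormous growth when valued at the old price, so the choice of base prices decides how much of the revolution shows up.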

If you want a full post on this, upvote this comment.

4Adam Zerner2y
In writing How much should we value life? [https://www.lesswrong.com/posts/k2PA8GcKHzGQWeG74/how-much-should-we-value-life], I spent some time digging into AI timeline stuff. It led me to When Will AI Be Created? [https://intelligence.org/2013/05/15/when-will-ai-be-created/#footnote_0_10199], written by Luke Muehlhauser for MIRI. He noted that there is reason not to trust expert opinions on AI timelines, and that trend extrapolation may be a good alternative. This point you're making about GDP seems like it is real progress towards coming up with a good way to do trend extrapolation, and thus seems worth a full post IMO. (Assuming it isn't already well known by the community or something, which I don't get the sense is the case.)
2Raemon2y
Upvoted, but I mostly trust you to write the post if it seems like there's an interesting meaty thing worth saying.
2johnswentworth2y
Eh, these were the main takeaways, the post would just be more details and examples so people can see the gears behind it.
4Mark Xu2y
A similar point is made by Korinek in his review [https://docs.google.com/document/d/14t5zNuaKHmnrnE0cLMSRST3LlZShM_pB35sTt-NbSeQ/edit#] of Could Advanced AI Drive Explosive Economic Growth [https://www.openphilanthropy.org/could-advanced-ai-drive-explosive-economic-growth]:
2Mark Xu2y
In general, Baumol-type effects (spending decreasing in sectors where productivity goes up) mean that we can have scenarios in which the economy is growing extremely fast on "objective" metrics like energy consumption, but GDP has stagnated because that energy is being spent on extremely marginal increases in goods being bought and sold.

Chrome is offering to translate the LessWrong homepage for me. Apparently, it is in Greek.

2habryka2y
Huh, amusing. We do ship a font that has nothing but the Greek letter set in it, because people use Greek unicode symbols all the time and our primary font doesn't support that character set. So my guess is that's where Google gets confused.
2johnswentworth2y
Oh, I had just assumed it was commentary on the writing style/content.
4Viliam2y
If about 10% of articles have "Ω" in their title, what is the probability that the page is in Greek? :D

Someone should write a book review of The Design of Everyday Things aimed at LW readers, so I have a canonical source to link to other than the book itself.

Languages should have tenses for spacelike separation. My friend and I do something in parallel, it's ambiguous/irrelevant which one comes first, I want to say something like "I expect my friend <spacelike version of will do/has done/is doing> their task in such-and-such a way". 

5JBlack1y
That sounds more like a tenseless sentence than using a spacelike separation tense. Your friend's performance of the task may well be in your future or past lightcone (or extend through both), but you don't wish to imply any of these. There are languages with tenseless verbs, as well as some with various types of spatial tense. The closest I can approximate this in English without clumsy constructs is "I expect my friend does their task in such-and-such a way", which I agree isn't very satisfactory.
4adamShimi1y
Who would have thought that someone would ever look at CSP [https://en.wikipedia.org/wiki/Communicating_sequential_processes] and think "I want English to be more like that"?
2johnswentworth1y
lol
3kave1y
Future perfect (hey, that's the name of the show!) seems like a reasonable hack for this in English

Two kinds of cascading catastrophes one could imagine in software systems...

  1. A codebase is such a spaghetti tower (and/or coding practices so bad) that fixing a bug introduces, on average, more than one new bug. Software engineers toil away fixing bugs, making the software steadily more buggy over time.
  2. Software services managed by different groups have dependencies - A calls B, B calls C, etc. Eventually, the dependence graph becomes connected enough and loopy enough that a sufficiently-large chunk going down brings down most of the rest, and nothing can go
... (read more)
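
The first scenario can be sketched as a simple average-case recursion. This is a toy model rather than anything from the post: every fix closes one bug and opens `new_bugs_per_fix` new ones on average (a hypothetical parameter), and randomness is ignored entirely.

```python
def expected_bugs(initial_bugs, new_bugs_per_fix, fixes):
    """Average-case open-bug count after `fixes` bug fixes.

    Toy model: each fix closes one bug and opens `new_bugs_per_fix`
    new ones on average. Once the count reaches zero there is nothing
    left to fix, so it stays at zero.
    """
    bugs = float(initial_bugs)
    for _ in range(fixes):
        if bugs <= 0:
            return 0.0
        bugs += new_bugs_per_fix - 1.0
    return max(bugs, 0.0)


# Above one new bug per fix, effort makes things worse; below, it converges.
print(expected_bugs(10, 1.2, 100))
print(expected_bugs(10, 0.5, 100))
```

The threshold at one new bug per fix is the same criticality condition as in a branching process: above it the backlog grows without bound no matter how much fixing effort is applied.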

I wish there were a fund roughly like the Long-Term Future Fund, but with an explicit mission of accelerating intellectual progress.

6habryka2y
I mean, just to be clear, I am all in favor of intellectual progress. But doing so indiscriminately does sure seem a bit risky in this world of anthropogenic existential risks. Reminds me of my mixed feelings on the whole Progress Studies thing.
6johnswentworth2y
Yeah, I wouldn't want to accelerate e.g. black-box ML. I imagine the real utility of such a fund would be to experiment with ways to accelerate intellectual progress and gain understanding of the determinants, though the grant projects themselves would likely be more object-level than that. Ideally the grants would be in areas which are not themselves very risk-relevant, but complicated/poorly-understood enough to generate generalizable insights into progress. I think it takes some pretty specific assumptions for such a thing to increase risk significantly on net. If we don't understand the determinants of intellectual progress, then we have very little ability to direct progress where we want it; it just follows whatever the local gradient is. With more understanding, at worst it follows the same gradient faster, and we end up in basically the same spot. The one way it could net-increase risk is if the most likely path of intellectual progress leads to doom, and the best way to prevent doom is through some channel other than intellectual progress (like political action, for instance). Then accelerating the intellectual progress part potentially gives the other mechanisms (like political bodies) less time to react. Personally, though, I think a scenario in which e.g. political action successfully prevents intellectual progress from converging to doom (in a world where it otherwise would have) is vanishingly unlikely (like, less than one-in-a-hundred, maybe even less than one-in-a-thousand).
3Quinn2y
You might check out Donald Braben's [https://quinnd.net/blog/elixir/] view; it says "transformative research" (i.e. fundamental results that create new fields and industries) is critical for the survival of civilization. He does not worry that transformative results might end civilization.

Here's an interesting problem of embedded agency/True Names which I think would make a good practice problem: formulate what it means to "acquire" something (in the sense of "acquiring resources"), in an embedded/reductive sense. In other words, you should be able-in-principle to take some low-level world-model, and a pointer to some agenty subsystem in that world-model, and point to which things that subsystem "acquires" and when.

Some prototypical examples which an answer should be able to handle well:

  • Organisms (anything from bacteria to plants to animals) eating things, absorbing nutrients, etc.
  • Humans making money or gaining property.
3Gunnar_Zarncke8mo
...and how the brain figures this out and why it is motivated to do so. There are a lot of simple animals that apparently "try to control" resources or territory. How? Drives to control resources occur everywhere. And your control of resources is closely related to your dominance in a dominance hierarchy, which seems to be regulated in many animals by serotonin. See e.g. https://www.nature.com/articles/s41386-022-01378-2

What if physics equations were written like statically-typed programming languages?

6jimrandomh3y
The math and physics worlds still use single-letter variable names for everything, decades after the software world realized that was extremely bad practice. This makes me pessimistic about the adoption of better notation practices.
5johnswentworth3y
Better? I doubt it. If physicists wrote equations the way programmers write code, a simple homework problem would easily fill ten pages. Verboseness works for programmers because programmers rarely need to do anything more complicated with their code than run it - analogous to evaluating an expression, for a physicist or mathematician. Imagine if you needed to prove one program equivalent to another algebraically - i.e. a sequence of small transformations, with a record of intermediate programs derived along the way in order to show your work. I expect programmers subjected to such a use-case would quickly learn the virtues of brevity.
3Steven Byrnes3y
Yeah, I'm apparently not intelligent enough to do error-free physics/engineering calculations without relying on dimensional analysis as a debugging tool. I even came up with a weird, hack-y way [https://github.com/sbyrnes321/numericalunits] to do that in computing environments like Excel and Cython, where flexible multiplicative types are not supported.
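
A minimal sketch of what "typed" physics quantities could look like, in the spirit of using dimensional analysis as a debugging tool. The `Quantity` class and its fields are hypothetical, the check happens at runtime rather than at compile time, and the approach is not taken from the linked numericalunits package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number tagged with exponents of (length, mass, time)."""
    value: float
    dims: tuple

    def __add__(self, other):
        # Addition only makes sense between like dimensions.
        if self.dims != other.dims:
            raise TypeError(f"cannot add dims {self.dims} and {other.dims}")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)

    def __truediv__(self, other):
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)


distance = Quantity(100.0, (1, 0, 0))  # metres
duration = Quantity(20.0, (0, 0, 1))   # seconds
speed = distance / duration            # dims become (1, 0, -1)

try:
    distance + duration                # metres + seconds is rejected
    mismatch_caught = False
except TypeError:
    mismatch_caught = True
```

Note that this keeps the short variable-style notation intact: the dimension bookkeeping lives in the type, not in longer names.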

An interesting conundrum: one of the main challenges of designing useful regulation for AI is that we don't have any cheap and robust way to distinguish a dangerous neural net from a non-dangerous net (or, more generally, a dangerous program from a non-dangerous program). This is an area where technical research could, in principle, help a lot.

The problem is, if there were some robust metric for how dangerous a net is, and that metric were widely known and recognized (as it would probably need to be in order to be used for regulatory purposes), then someone would probably train a net to maximize that metric directly.

6Garrett Baker2mo
This seems to lead to the solution of trying to make your metric one-way, in the sense that your metric should 1. Provide an upper bound on the dangerousness of your network 2. Compress the space of networks which map to approximately the same dangerousness level on the low end of dangerousness, and expand the space of networks which map to approximately the same dangerousness level on the upper end of dangerousness, so that you can train your network to minimize the metric, but when you train your network to maximize the metric you end up in a degenerate area with technically very high measured danger levels but in actuality very low levels of dangerousness. We can hope (or possibly prove) that as you optimize upwards on the metric you become subject to Goodhart's curse, but the opposite occurs on the lower end.
4Thane Ruthenis2mo
Sure, even seems a bit tautological: any such metric, to be robust, would need to contain in itself a definition of a dangerously-capable AI, so you probably wouldn't even need to train a model to maximize it. You'd be able to just lift the design from the metric directly.
2Thane Ruthenis1mo
Do you have any thoughts on a softer version of this problem, where the metric can't be maximized directly, but gives a concrete idea of what sort of challenge your AI needs to beat to qualify as AGI? (And therefore in which direction in the architectural-design-space you should be moving.) Some variation on this [https://www.lesswrong.com/posts/3JRBqRtHBDyPE3sGa/a-case-for-the-least-forgiving-take-on-alignment?commentId=NKKKFM3mHmyt4Adp6] seems like it might work as a "fire alarm" test set, but as you point out, inasmuch as it's recognized, it'll be misapplied for benchmarking instead. (I suppose the ideal way to do it would be to hand it off to e.g. ARC, so they can use it if OpenAI invites them for safety-testing again. This way, SOTA models still get tested, but the actors who might misuse it aren't aware of the testing's particulars until they succeed anyway...)

I just went looking for a good reference for the Kelly criterion, and didn't find any on LessWrong. So, for anybody who's looking: chapter 6 of Cover & Thomas's textbook on information theory is the best source I currently know of.

6Yoav Ravid2y
Might be a good thing to add to the Kelly Criterion tag [https://www.lesswrong.com/tag/kelly-criterion]
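
For quick orientation, here is a hedged sketch of the simplest special case only: a repeated binary bet with win probability `p` and net odds `b` (win `b` per unit staked, lose the stake otherwise), where the growth-optimal stake is the fraction f* = p - (1 - p)/b of the bankroll. The function and parameter names are illustrative and not taken from the textbook.

```python
import math


def kelly_fraction(p, b):
    """Growth-optimal bankroll fraction for a binary bet.

    p: probability of winning; b: net odds (profit per unit staked).
    A non-positive result means the bet has no edge and should be skipped.
    """
    return p - (1.0 - p) / b


def log_growth(f, p, b):
    """Expected log-growth of the bankroll per bet when staking fraction f."""
    return p * math.log(1.0 + f * b) + (1.0 - p) * math.log(1.0 - f)


f_star = kelly_fraction(0.6, 1.0)  # 60% win chance at even odds
print(f_star, log_growth(f_star, 0.6, 1.0))
```

Comparing `log_growth` at `f_star` against nearby fractions is a cheap sanity check that the formula really sits at the maximum of expected log wealth.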

Neat problem of the week: we have n discrete random variables, $X_1, \dots, X_n$. Given any variable, all variables are independent:

$$\forall i: \quad P[X \mid X_i] = \prod_j P[X_j \mid X_i]$$

Characterize the distributions which satisfy this requirement.

This problem came up while working on the theorem in this post, and (separately) in the ideas behind this post. Note that those posts may contain some spoilers for the problem, though frankly my own proofs on this one just aren't very good.
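
For anyone who wants to test candidate distributions numerically, here is a small checker. It reads the requirement as: for every i, conditional on the i-th variable, the joint distribution factors into the product of each variable's conditional distribution. The function name and the dict-based representation of the joint are assumptions made for this sketch, and the example distributions below are only sanity checks, not a characterization.

```python
from collections import defaultdict
from itertools import product


def satisfies_condition(joint, tol=1e-9):
    """joint: dict mapping outcome tuples (x_1, ..., x_n) to probabilities.

    True if, for every i and every outcome x with P[X_i = x_i] > 0,
    P[X = x | X_i = x_i] equals the product over j of P[X_j = x_j | X_i = x_i].
    """
    n = len(next(iter(joint)))
    values = [sorted({x[k] for x in joint}) for k in range(n)]
    # pair[(i, j)][(a, b)] = P[X_i = a, X_j = b]; pair[(i, i)] holds marginals.
    pair = defaultdict(lambda: defaultdict(float))
    for x, p in joint.items():
        for i in range(n):
            for j in range(n):
                pair[(i, j)][(x[i], x[j])] += p
    for i in range(n):
        for x in product(*values):
            p_i = pair[(i, i)][(x[i], x[i])]
            if p_i <= 0:
                continue  # conditioning on a zero-probability value
            lhs = joint.get(x, 0.0) / p_i
            rhs = 1.0
            for j in range(n):
                rhs *= pair[(i, j)][(x[i], x[j])] / p_i
            if abs(lhs - rhs) > tol:
                return False
    return True


independent = {x: 0.125 for x in product([0, 1], repeat=3)}
copies = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
xor = {(a, b, a ^ b): 0.25 for a, b in product([0, 1], repeat=2)}
print(satisfies_condition(independent), satisfies_condition(copies), satisfies_condition(xor))
```

Fully independent variables and exact copies of one variable both pass, while the XOR triple fails, which hints that the interesting cases lie somewhere between those two extremes.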

For short-term, individual cost/benefit calculations around C19, it seems like uncertainty in the number of people currently infected should drop out of the calculation.

For instance: suppose I'm thinking about the risk associated with talking to a random stranger, e.g. a cashier. My estimated chance of catching C19 from this encounter will be roughly proportional to $N_{\text{infected}}$, the number of people currently infected. But, assuming we already have reasonably good data on number hospitalized/died, my chances of hospitalization/death given infection will be roughly inversely proportional to $N_{\text{infected}}$... (read more)
