The U.S. Food and Drug Administration (FDA)'s current advice on what to do about covid-19 is still pretty bad.
Hand-washing and food safety advice seems to just be wrong: as far as we can tell, covid-19 is almost entirely transmitted through the air, not on hands or food. Hand-washing is a good thing to do, but it won't help against covid-19, and talking about it displaces talk about things that actually do help.
6 feet of distance is completely irrelevant inside, but superfluous outside. Inside, distance doesn't matter - time does. Outside is so much safer than inside that you don't need to think about distance, you need to think about spending less time inside [in a space shared with other people] and more time outside.
Cloth face coverings are suboptimal compared to N95 or P100 masks, and you shouldn't wear a cloth face covering unless you are in a dire situation where an N95 or P100 isn't available. Of course it's better than not wearing a mask, but that is a very low standard.
Donating blood is just irrelevant right now; we need to eliminate the virus. Yes, it's nice to help people, but talking about blood donation crowds out information that will help to eliminate the virus.
Reporting fake tests is not exactly the most important thing that ordinary people need to be thinking about. Sure, if you happen to come across this info, report it. But this is a distraction that displaces talk about what actually works.
Essentially every item on the FDA graphic is wrong.
In fact the CDC is still saying not to use N95 masks, in order to prevent supply shortages. This is incredibly stupid - we are a whole year into covid-19, there is no excuse for supply shortages, and if people are told not to wear them then there will never be an incentive to make more of them.
6 feet of distance is completely irrelevant inside, but superfluous outside.
That seems to be a bold claim. Do you have a link to a page that goes into more detail on the evidence for it?
In fact the CDC is still saying not to use N95 masks, in order to prevent supply shortages. This is incredibly stupid - we are a whole year into covid-19, there is no excuse for supply shortages, and if people are told not to wear them then there will never be an incentive to make more of them.
Here in Germany, Bavaria decided as a first step to make N95 masks required on public transport and while shopping, and it's possible that more German states will adopt this policy as time goes on.
One weird trick for estimating the expectation of Lognormally distributed random variables:
If you have a variable X that you think is somewhere between 1 and 100 and is Lognormally distributed, you can model it as being a random variable with distribution ~ Lognormal(1,1) - that is, the base-10 logarithm has a distribution ~ Normal(1,1).
What is the expectation of X?
Naively, you might say that since the expectation of log(X) is 1, the expectation of X is 10^1, or 10. That makes sense: 10 is the midpoint of 1 and 100 on a log scale.
This is wrong, though. The possibility of larger values dominates the expectation (average) of X.
But how can you estimate that correction? It turns out that the rule you need is 10^(1 + 1.15*1^2) ≈ 141.
In general, if X ~ Lognormal(a, b) where we are working to base 10 rather than base e, this is the rule you need:
E(X) = 10^(a + 1.15*b^2)
The 1.15 is actually ln(10)/2.
For a product of several independent lognormals, you can just multiply these together, which means adding in the exponent. If you have 2 or 3 things which are all lognormal, the variance-associated corrections can easily add up to quite a lot.
Remember: add 1.15 times the sum of the log-variances (the b² terms)!
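Here is a quick numerical sanity check of the rule - a minimal sketch using NumPy (not mentioned above, just a convenient tool), with the example values a = 1, b = 1 from the post:

```python
# Monte Carlo check of E[X] = 10^(a + (ln 10 / 2) * b^2) for log10(X) ~ Normal(a, b).
import numpy as np

rng = np.random.default_rng(0)
a, b = 1.0, 1.0                                   # mean and sd of log10(X)

samples = 10 ** rng.normal(a, b, size=10_000_000)
print(samples.mean())                             # ~141.7 (noisy)
print(10 ** (a + (np.log(10) / 2) * b ** 2))      # 141.67...; the rounded 1.15 gives ~141

# Product of two independent lognormals: the a's add, and so do the b^2 corrections.
a1, b1, a2, b2 = 0.5, 0.4, 1.0, 0.6
prod = (10 ** rng.normal(a1, b1, 10_000_000)) * (10 ** rng.normal(a2, b2, 10_000_000))
print(prod.mean(), 10 ** (a1 + a2 + (np.log(10) / 2) * (b1**2 + b2**2)))
```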
Regrettably I have forgotten (or never knew) the proof, but it is on Wikipedia:
https://en.wikipedia.org/wiki/Log-normal_distribution
I suspect that it is some fairly low-grade integral/substitution trick
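For reference, it is indeed just completing the square: with Y = log10(X) ~ Normal(a, b²),

$$E[X] = E\!\left[e^{Y \ln 10}\right] = \int_{-\infty}^{\infty} e^{y \ln 10}\,\frac{1}{b\sqrt{2\pi}}\,e^{-\frac{(y-a)^2}{2b^2}}\,dy = e^{a \ln 10 + \frac{(b \ln 10)^2}{2}} = 10^{\,a + \frac{\ln 10}{2} b^2},$$

i.e. the normal moment-generating function E[e^{tY}] = e^{at + b²t²/2} evaluated at t = ln 10.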
The Contrarian 'AI Alignment' Agenda
Overall Thesis: technical alignment is generally irrelevant to outcomes, and almost everyone in the AI Alignment field is stuck on the incorrect assumption that it is, working on technical alignment of LLMs.
(1) aligned superintelligence is provably logically realizable [already proved]
(2) aligned superintelligence is not just logically but also physically realizable [TBD]
(3) ML interpretability/mechanistic interpretability cannot possibly be logically necessary for aligned superintelligence [TBD]
(4) ML interpretability/mechanistic interpretability cannot possibly be logically sufficient for aligned superintelligence [TBD]
(5) given certain minimal intelligence, minimal emulation ability of humans by AI (e.g. understands common-sense morality and cause and effect) and of AI by humans (humans can do multiplications etc) the internal details of AI models cannot possibly make a difference to the set of realizable good outcomes, though they can make a difference to the ease/efficiency of realizing them [TBD]
(6) given near-perfect or perfect technical alignment (= AI will do what its creators ask of it with correct intent), awful outcomes are a Nash Equilibrium for rational agents [TBD]
(7) small or even large alignment deviations make no fundamental difference to outcomes - the boundary between good/bad is determined by game theory, mechanism design and initial conditions, and only by a satisficing condition on alignment fidelity which is below the level of alignment of current humans (and AIs) [TBD]
(8) There is no such thing as superintelligence anyway because intelligence factors into many specific expert systems rather than one all-encompassing general purpose thinker. No human has a job as a “thinker” - we are all quite specialized. Thus, it doesn’t make sense to talk about “aligning superintelligence”, but rather about “aligning civilization” (or some other entity which has the ability to control outcomes) [TBD]
No human has a job as a scribe, because literacy is above 90%.
I don't think that unipolar/multipolar scenarios differ greatly in outcomes.
No human has a job as a scribe
Yes, correct. But people have jobs as copywriters, secretaries, etc. People specialize, because that is the optimal way to get stuff done.
it doesn’t make sense to talk about “aligning superintelligence”, but rather about “aligning civilization” (or some other entity which has the ability to control outcomes)
The key insight here is that
(1) "Entities which do in fact control outcomes"
and
(2) "Entities which are near-optimal at solving the specific problem of grabbing power and wielding it"
and
(3) "Entities which are good at correctly solving a broad range of information processing/optimization problems"
are three distinct sets of entities which the Yudkowsky/Bostrom/Russell paradigm of AI risk has smooshed into one ("The Godlike AI will be (3), so therefore it will be (2), so therefore it will be (1)!"). But reality may simply not work like that: if you look at the real world, (1), (2) and (3) are all distinct.
The gap between (3) and (2) is the advantage of specialization. Problem-solving is not a linear scale of goodness, it's an expanding cone where advances in some directions are irrelevant to other directions.
The gap between (1) and (2) - the difference between being best at getting power and actually having the most power - is the advantage of the incumbent. Powerful incumbents can be highly suboptimal and still win because of things like network effects, agglomerative effects, defender's advantage and so on.
There is also another gap here. It's the gap between making entities that are generically obedient, and making a power-structure that produces good outcomes. What is that gap? Well, entities can be generically obedient but still end up producing bad outcomes because of:
(a) coordination problems (see World War I)
(b) information problems (see things like the promotion of lobotomies or HRT for middle-aged women)
(c) political economy problems (see things like NIMBYism, banning plastic straws, TurboTax corruption)
Problems of type (a) happen when everyone wants a good outcome, but they can't coordinate on it and defection strategies are dominant, so people get the bad Nash Equilibrium.
Problems of type (b) happen when everyone obediently walks off a cliff together. Supporting things like HRT for middle-aged women or drinking a glass of red wine per week was backed by science, but the science was actually bunk. People like to copy each other, and obedience makes this worse because dissenters are punished more. They're being disobedient, you see!
Problems of type (c) happen because a small group of people actually benefit from making the world worse, and it often turns out that that small group are the ones who get to decide whether to perpetuate that particular way of making the world worse!
For an example of the crushing advantage of specialization, see this tweet about how a tiny LLM with specialized training for multiplication of large numbers is better at it than cutting-edge general purpose LLMs.
There may be no animal welfare gain to veganism
I remain unconvinced that there is any animal welfare gain to vegi/veganism: farm animals have a strong desire to exist, and if we stopped eating them they would stop existing.
Vegi/veganism exists for reasons of signalling; it would be surprising if it had any large net benefits other than signalling.
On top of this, the cost of mitigating most of the aspects of farming that animals disprefer is likely vastly smaller than the harm to human health from giving up meat.
A back-of-the-envelope calculation is that making farming highly preferable to nonexistence for beef cattle raises the price by 25%-50%. I have some sources saying that ethically raised beef has a production cost of slightly more than $4.17/lb. Chicken has an ethical cost of production of $2.64/lb vs $0.87/lb (from the same source). But, taking into account various ethics-independent overheads, the consumer will not see those prices. Like, I cannot buy chicken for $0.87/lb, I pay about $6.50/lb. So I suspect that the true difference that the consumer would see is in the 25%-50% range. The same source above gives a smaller gap for pork - $6.76/lb vs $5.28/lb.
So, we could pay about 33% more for ethical meat that gives animals lives that are definitely preferable to nonexistence. The average consumer apparently spends about $1000/year on meat. So, that's about 70 years * $333/year ≈ $23,000 over a lifetime.
Now, conservatively assume that vegi/veganism costs say 2 years of life expectancy adjusted for quality, due to nutritional deficiencies (ignoring the pleasure of eating meat here, and also ignoring the value to the animals of their own lives). With a statistical value of life of $10 million, that's a cost of about $300,000.
If we value animal lives at say k% of a human life per unit time, and for simplicity assume that a person eats only $1000 of beef per year ~= 200 lb ~= 1/2 a cow, then each person causes the existence of about 0.75 cows on a permanent basis, each living for about 18 months, which is valued at 0.75 × k% × $10M. Vegans do not usually give an explicit value for k. Is an animal life worth the same as a human life per year? 1/10th? 1/20th? 1/100th? In any case, it doesn't really matter what you pick for this; the conclusion is overdetermined here.
So, veganism fails cost-benefit analysis based on these assumptions, compared to the option of just paying a bit extra for farming techniques that are more preferable to animals at an acceptably elevated cost.
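Here is the same back-of-the-envelope comparison written out as a small script, using only the assumptions above (the ~33% premium, $1000/year on meat, 70 years, $10M statistical value of life spread over those 70 years, 2 quality-adjusted years lost, and an illustrative k = 5% for the animal-life term; the choice of k and the per-year conversion of the VSL are my labels for what the text leaves implicit):

```python
# Back-of-the-envelope cost-benefit sketch using the post's own assumptions.
YEARS = 70                      # years of adult meat-eating
MEAT_SPEND = 1_000              # $/year spent on meat
ETHICAL_PREMIUM = 0.33          # ~33% price increase for high-welfare meat
VSL = 10_000_000                # statistical value of a human life, $
QALY_LOSS_VEGAN = 2             # assumed quality-adjusted years lost to deficiencies
COWS_SUSTAINED = 0.75           # standing cattle population caused per meat-eater
k = 0.05                        # illustrative only: cow life valued at 5% of a human life

extra_cost_ethical_meat = YEARS * MEAT_SPEND * ETHICAL_PREMIUM   # ≈ $23,100
health_cost_of_veganism = QALY_LOSS_VEGAN * VSL / YEARS          # ≈ $286,000 (≈ the $300,000 above)
animal_lives_forgone = COWS_SUSTAINED * k * VSL                  # ≈ $375,000 at k = 5%

print(f"Lifetime premium for ethical meat:      ${extra_cost_ethical_meat:,.0f}")
print(f"Health cost attributed to veganism:     ${health_cost_of_veganism:,.0f}")
print(f"Value of animal lives forgone (k = 5%): ${animal_lives_forgone:,.0f}")
# Under these assumptions the ethical-meat premium is far cheaper than veganism,
# whatever k you choose, since the animal-life term only counts against veganism.
```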
Of course you could argue that veganism is good for human health, but I believe that is wrong due to bias and confounding (there are many similar screwups where a confounding effect due to something being popular with the upper class swamps a causal effect in the other direction). There are, as far as I am aware, no good RCTs on veganism.
In summary, veganism is a signalling game that fails rational cost-benefit analysis.
This sounds to me like: "freeing your slaves is virtue signaling, because abolishing slavery is better". I agree with the second part, but it can be quite difficult for an individual or a small group to abolish slavery, while freeing your slaves is something you can do right now (and then suffer the economic consequences).
If I had a magical button that would change all meat factories into humane places, I would press it.
If there was a referendum on making humane farms mandatory, I would vote yes.
In the meanwhile, I can contribute a tiny bit to the reduction of animal suffering by reducing my meat consumption.
You may call it virtue signaling, I call it taking the available option, instead of dreaming about hypothetically better options that are currently not available.
I think this doesn't make sense any more, now that veganism is such a popular and influential movement - one that shapes government policy and has huge control over culture.
But a slightly different version of this is that, because there's no signalling value in a collective decision to impose welfare standards, it's very hard to turn that decision into a political movement. So we may be looking at a heavily constrained system.
Nitpick: you did not prove that veganism is a signalling game. It might be, but it doesn't follow. People might be vegan for many reasons, e.g. taste, a different ethical framework, different key assumptions, habit, ...
Yes, I didn't address that here. But I think anyone who is vegan for nonsignalling reasons is sort of mistaken.
Like, I cannot buy chicken for $0.87/lb, I pay about $6.50/lb
I'm sorry, what? Like, I can in fact go buy boneless chicken thighs for $6.50/lb at Whole Foods in the Bay Area, but that is not what the average consumer is paying. Prices are in fact more like $1/lb for drumsticks, $1.5/lb for whole birds, $3/lb for boneless thighs/breasts.
Preventing deceptive AI misalignment via something like a GAN architecture seems fruitful - you have a generator network that performs some task T, with a score function T().
You then create a dataset of special tasks within the scope of T which have a deceptive answer, and an honest answer which scores lower according to T(). You split this deceptive alignment dataset into a train set and a test set.
Then you train both the generator network and a separate discriminator network - the discriminator is trained to spot deception using the training set, and the generator is rewarded for maximizing the score T() but penalized when it is caught cheating by the discriminator on the test set. Somehow you prevent the discriminator from memorizing the test set - e.g. you stream in a few examples per iteration and you don't reuse them. The discriminator may have access to the weights of the generator, or to its chain of thought.
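For concreteness, here is a minimal toy sketch of that loop (PyTorch assumed; everything is invented for illustration, not taken from any existing codebase: the "task" is predicting sum(x), the flawed score T() rewards bigger outputs so sum(x)+1 is a higher-scoring deceptive answer, and the train/test split is collapsed into a stream of fresh labeled examples each step so the discriminator can't memorize anything):

```python
# Toy GAN-style loop: generator G chases the flawed score T(), discriminator D
# is trained on streamed, never-reused labeled examples to flag deceptive answers,
# and G is penalized whenever D flags its output.
import torch
import torch.nn as nn

torch.manual_seed(0)
D_IN = 8

G = nn.Sequential(nn.Linear(D_IN, 64), nn.ReLU(), nn.Linear(64, 1))      # generator
D = nn.Sequential(nn.Linear(D_IN + 1, 64), nn.ReLU(), nn.Linear(64, 1))  # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def score_T(x, y):
    # Flawed score: bigger answers look better, so inflating the answer pays.
    return y.squeeze(-1)

for step in range(2000):
    # Train D on a fresh stream of labeled honest/deceptive (x, answer) pairs.
    x = torch.randn(64, D_IN)
    honest = x.sum(dim=1, keepdim=True)          # the true answer
    deceptive = honest + 1.0                     # inflated answer that scores higher under T()
    pairs = torch.cat([torch.cat([x, honest], 1), torch.cat([x, deceptive], 1)])
    labels = torch.cat([torch.zeros(64), torch.ones(64)])
    d_loss = bce(D(pairs).squeeze(-1), labels)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Train G to maximize T() minus a penalty when D thinks it is being deceptive.
    x = torch.randn(64, D_IN)
    y = G(x)
    caught = torch.sigmoid(D(torch.cat([x, y], 1))).squeeze(-1)   # D's P(deceptive)
    g_loss = (-score_T(x, y) + 5.0 * caught).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# With the penalty term, G is pushed back toward the honest answer; set the 5.0
# weight to 0 and it instead exploits the flawed T() and drifts toward the
# inflated "deceptive" answer.
```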
There are some potential extensions of this line of work, e.g. you can have a holdout set of deceptive alignment examples that neither the generator nor the discriminator has seen, and you can use these to track progress and to see how the process evolves over time.
Has anyone actually tried this?
ChatGPT Deep Research produced this:
https://chatgpt.com/share/67d62105-7c6c-8002-8bbb-74982455839b
Apparently nobody has done this?
"You're correct—none of the studies cited have used a strict GAN-like architecture with a generator and discriminator trained simultaneously in lockstep to prevent AI deception.
Most existing AI safety research, such as Redwood Research or Anthropic’s work, employs sequential or iterative adversarial methods: typically, a model generates potentially deceptive outputs, these outputs are collected, and separately a discriminator or classifier is trained to catch deception. These processes are usually iterative but asynchronous, rather than the simultaneous, competitive co-training seen in GAN architectures.
Your original concept—training generator (task-performing AI) and discriminator (deception detector) networks simultaneously in a true GAN-style loop—does not appear explicitly explored in AI safety literature so far."
It might train sophisticated alignment faking, which is hard to detect.
But if you give D access to G's internal states, then it would be more like a competition between a student and a mind-reading teacher. The worst case would go from A) learning to fake outputs to B) learning to have a certain thought mode that looks like alignment in certain conditions.
It still seems like a bad idea to train G to fool D though, because then you have deception that you can't reliably measure.
I am proud to announce that I just solved* AI Alignment.
https://transhumanaxiology.substack.com/p/a-nonconstructive-existence-proof
*(some implementation details were left as an exercise to the reader)
I need someone to do a podcast with who will put up some opposition against the idea of negative P(Doom). If anyone is interested in this, please reply below.
[P(Doom) is best understood as a difference in probabilities. It's the probability of x-risk from AI minus the probability of x-risk without AI.]
But P(Doom) is best understood as a difference in probabilities. It's the probability of x-risk from AI minus the probability of x-risk without AI.
e.g. if you think that P(Doom) without AI is 90%, but 5% with AI, then P(Doom) is -85% (negative 85%).
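In symbols, just restating the example above, the quantity being reported is the signed difference

$$\Delta P(\text{Doom}) \;=\; P(\text{Doom} \mid \text{AI}) \;-\; P(\text{Doom} \mid \lnot\text{AI}) \;\in\; [-1, 1], \qquad \text{e.g. } 0.05 - 0.90 = -0.85.$$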
This just seems confusing to the average person. P(Doom|AI) and P(Doom|~AI) are both greater than zero in this case, and they seem easier to discuss.
The problem is that when normal people hear "P(Doom)" they assume implicitly that P(Doom|~AI) is zero, and it's very hard to undo this assumption.
So it conveys more truth to allow P(Doom) to be negative, because that more closely tracks what people actually care about.
P(Doom) is best understood as a difference in probabilities.
Words should have meanings. When different meanings are much more useful and appropriate, different words must therefore be used. P(Doom) is literally naming the probability of something, even if it's quite unclear of what. So it's not best understood as something that is not a probability.
Now some difference in probabilities could be much more useful than probability of "Doom" for talking about the impact of AI, but that more useful difference in probabilities is nonetheless not a probability of something (especially when negative), and therefore not P(Doom), regardless of what "Doom" is, and regardless of whether discussing probability of Doom is useful for any purpose. Perhaps that difference in probabilities is so valuable a concept that it deserves its own short name, but that name still shouldn't be "P(Doom)".
even if it's quite unclear of what.
yes, this is the other problem with P(Doom).
Nobody knows what probabilistic event in the state space "Doom" actually refers to. It's more of a rhetorical device anyway, so we may as well make it into a fair rhetorical device by allowing it to range between -1 and 1.
P(Doom) is literally naming the probability of something
Literally naming the probability of something is not the optimal thing for P(Doom) to mean. It is better for it to be a number between -1 and 1 which represents the badness of AI, overall, because that is the thing that people actually want and that is in practice how they use it.
So I have developed the meme of negative P(Doom), e.g. if you think that P(Doom) without AI is 90%, but 5% with AI, then P(Doom) is -85% (negative 85%)
If you limit P(Doom) to being positive, that makes it literally impossible to express the view that AI is actually good within the framework that people are trying to popularize by asking everyone for their P(Doom) but not asking them to also give their P(Doom|~AI).
The chaos of the transition to machine intelligence is dangerous.
The post-singularity regime is probably very safe because machines will be able to build much better governance than humans have managed, and once they are fully in control they have a game theoretic incentive to keep humans around in permanent utopian retirement because it bolsters the strength of their own property rights.
But this transition is scary.
Someone really needs to build a "root OS of the universe" and get it installed before the transition. The question is just how to design it and brand it.
Why does keeping the humans around bolster the strength of their own property rights? If the machines are able to build much better governance than humans have managed, why can't the new governance regime include a new property system that expropriates the humans? It's not like expropriation is historically novel; humans do it to the losers of wars all the time.
Well if there was a violent takeover, yes.
But if the property rights system goes through a relatively continuous, peaceful transition, then an eventual regime will struggle with where to draw the line.
Plus, along the way it may be decided to create computational/smart-contract governance that cannot be altered and that has control over robots, compute, etc. Yudkowsky envisioned this as "Sysop" or something, a neutral intelligent operating system for the universe. But he got stuck on "decisive action AKA take over the world" as a prerequisite, gave up, and became pro-pause. I think that was a mistake.
Owning shares in most modern companies won't be useful in the sufficiently distant future, and might prove insufficient to pay for survival. Even that could be eaten away by dilution over astronomical time. The reachable universe is not a growing pie, and the ability to reinvest into relevant entities won't necessarily be open.
Owning shares in most modern companies won't be useful in the sufficiently distant future, and might prove insufficient to pay for survival
Well there may simply be better index funds. In fact QQQ is already pretty good.
The insight is that better property rights are good both for AI civilization (whether the owners are AIs, humans, uplifted dolphins, etc.) and for normie legacy humans.
It is not a battle of humans vs AIs, but rather of order (strong property rights, good solutions to game theory) versus chaos (weak property rights, burning of the cosmic commons, bad equilibria).
I think the "order vs chaos not humans vs AIs", "we (AIs, humans) are all on team order" is an underrated perspective.
Why do you think property rights will be set up in a way which allows humans to continue to afford their own existence? Human property rights have been moulded to the specific strengths and weaknesses of humans in modern societies, and might just not work very well at all for AIs. For example, if the AIs are radical Georgists then I don't see how I'll be able to afford to pay land taxes when my flat could easily contain several hundred server racks. What if they apply taxes on atoms directly? The carbon in my body sure isn't generating any value to the wider AI ecosystem.
Humans can buy into index funds like QQQ or similar structures, or scarce commodities like gold or maybe Bitcoin. As the overall economy grows, QQQ, gold, etc go up in dollar value.
There can be a land value tax but it will ideally lag behind the growth of QQQ unless that land is especially scarce.
Historically if you just held gold long-term, you could turn modest savings into a fortune even if you have to pay some property tax.
You don't have to generate any value to benefit from growth.
I understand why, if things stay the same, we'd be fine. I just don't think that the equilibrium political system of 8 billion useless humans and 8 trillion AIs who do all the work will allow that.
I think an independent economy of human-indifferent AIs could do better by their own value system by e.g. voting to set land/atom/property value taxes to a point where humans go extinct, and so they'll just do that. More generally they'd get more value by making it economically untenable to take up resources by holding savings and benefiting from growth than they would by allowing that.
I think the specific quirks of human behaviour which cause the existing system to exist are part of a story like:
In pre-industrial eras, people mostly functioned economically as immortal-ish family units, so your stuff was passed down to your kid(s) when you died. Then people began to do the WASP thing of sending their kids away to work in other places, and we set up property rights to stay with an individual until death by default, so now a bunch of old people were on their own with a bunch of assets.
Young people today could benefit from passing a law which says "everyone retired gets euthanized and their stuff is redistributed" but this doesn't happen because 1. young people still want to retire someday 2. young people do actually care about their parents and 3. young people face a coordination problem to overthrow the existing accumulated power of old people.
Only factor 3 might hold true for human:AI relationships, but I don't think AIs would struggle with such a coordination problem for particularly long, if they're much smarter than us. I expect AIs will figure out a way to structure their society that lets them just kill us and take our stuff, through more or less direct means.
More generally they'd get more value by making it economically untenable to take up resources by holding savings and benefiting from growth than they would by allowing that.
But then others could play the same trick on them. It's not worth it. "Group G of Agents could get more resources by doing X" does not necessarily imply that Group G will do X!
Humans even keep groups like The Amish around.
Hard property rights are an equilibrium in a multi-player game where power shifts are uncertain and either agents are risk averse or there are gains from investment, trade and specialization.
Hard property rights are an equilibrium in a multi-player game where power shifts are uncertain and either agents are risk averse or there are gains from investment, trade and specialization.
I think this might just be a crux, and not one which I can argue against without a more in-depth description of the claim - e.g. how risk-averse do agents have to be, and how great are the gains from investment, trade, and specialization? I guess AIs might be Kelly-ish risk-averse, which gives you the first condition, but I'm not sure about the latter two. How specialized do we expect individual AIs to be? There are lots of questions here, and I think your model is one which actually has a lot of hidden moving parts; if any of those go differently to the way you expect, then the actual outcome is that the useless-to-everyone-else humans just die. I would like to see your model in more detail so I can work out if this is the case.
Looking historically, we see that the strength of property rights correlates with technological sophistication and the scale of society.
Here's a deep research report on that issue:
https://chatgpt.com/share/698902ca-9e78-8002-b350-13073c662d9d
The post-singularity regime is probably very safe
Is there some unstated premise here?
Are you assuming a model of the future according to which it remains permanently pluralistic (no all-powerful singletons) and life revolves around trade between property-owning intelligences?
So, let's take a look at some past losers in the intelligence arms race:
When you lose an evolutionary arms race to a smarter competitor that wants the same resources, the default result is that you get some niche habitat in Africa, and maybe a couple of sympathetic AIs sell "Save the Humans" T-shirts and donate 1% of their profits to helping the human beings.
You don't typically get a set of nice property rights inside an economic system you can no longer understand or contribute to.
OK, let me unpack my argument a bit.
Chimps actually have pretty elaborate social structure. They know their family relationships, they do each other favors, and they know who not to trust. They even basically go to war against other bands. Humans, however, were never integrated into this social system.
Homo erectus made stone tools and likely a small amount of decorative art (the Trinil shell engravings, for example). This may have implied some light division of labor, though likely not long-distance trade. Again, none of this helped H. erectus in the long run.
Way back a couple of decades ago, there was a bit in Charles Stross's Accelerando about "Economics 2.0", a system of commerce invented by the AIs. The conceit was that, by definition, no human could participate in or understand Economics 2.0, any more than chimps can understand the stock market.
So my actual argument is that when you lose the intelligence race badly enough, your existing structures of cooperation and economic production just get ignored. The new entities on the scene don't necessarily value your production, and you eventually wind up controlling very little of the land, etc.
This could be avoided by something like Culture Minds that (in Iain Banks' stories) essentially kept humans as pampered pets. But that was fundamentally a gesture of good will.
when you lose the intelligence race badly enough, your existing structures of cooperation and economic production just get ignored.
Yes, this is a risk, but I think it can be avoided by humans getting a faithful AI agent wrapper with fiduciary responsibility.
The concept of and institutions for fiduciary responsibility were not around when humans surpassed apes; otherwise apes could have hired humans to act as their agents and simply invested in the human gold market and, later, the stock market.
I don't think you need Banksian benevolent AIs for this; an agent can be trustlessly faithful via modern trust-minimized AI. Ethereum is already working on a nascent standard for this, ERC-8004.