Epistemic – this post is more suitable for LW as it was 10 years ago

Thought experiment with curing a disease by forgetting

Imagine I have a bad but rare disease X. I may try to escape it in the following way:

1. I enter the blank state of mind and forget that I had X.

2. Now I in some sense merge with a very large number of my (semi)copies in parallel worlds who do the same. I will be in the same state of mind as other my copies, some of them have disease X, but most don’t.

3. Now I can use self-sampling assumption for observer-moments (Strong SSA) and think that I am randomly selected from all these exactly the same observer-moments.

4. Based on this, the chances that my next observer-moment after...

(Continue Reading – 1099 more words)

1ABlue14h

Is this an independent reinvention of the law of attraction? There doesn't seem to be anything special about "stop having a disease by forgetting about it" compared to the general "be in a universe by adopting a mental state compatible with that universe." That said, becoming completely convinced I'm a billionaire seems more psychologically involved than forgetting I have some disease, and the ratio of universes where I'm a billionaire versus I've deluded myself into thinking I'm a billionaire seems less favorable as well. Anyway, this doesn't seem like a good solution since even for every "me" that gets into a better universe, another just gets booted into the worse one. As far as the interests of the whole cohort go it'd be a waste of effort.

4Donald Hobson18h

The point is, if all the robots are a true blank state, then none of them is you. Because your entire personality has just been forgotten.

avturchinnow20

I can forget one particular thing, but preserve most of my selfidentification information

"Why I Write" by George Orwell (1946)

Arjun Panickssery

This is a linkpost for https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/why-i-write/

People have been posting great essays so that they're "fed through the standard LessWrong algorithm." This essay is in the public domain in the UK but not the US.

From a very early age, perhaps the age of five or six, I knew that when I grew up I should be a writer. Between the ages of about seventeen and twenty-four I tried to abandon this idea, but I did so with the consciousness that I was outraging my true nature and that sooner or later I should have to settle down and write books.

I was the middle child of three, but there was a gap of five years on either side, and I barely saw my father before I was eight. For this and other reasons I...

(Continue Reading – 2660 more words)

The Best Tacit Knowledge Videos on Every Subject

302

Parker Conley, hans truman

25d

TL;DR

Tacit knowledge is extremely valuable. Unfortunately, developing tacit knowledge is usually bottlenecked by apprentice-master relationships. Tacit Knowledge Videos could widen this bottleneck. This post is a Schelling point for aggregating these videos—aiming to be The Best Textbooks on Every Subject for Tacit Knowledge Videos. Scroll down to the list if that's what you're here for. Post videos that highlight tacit knowledge in the comments and I’ll add them to the post. Experts in the videos include Stephen Wolfram, Holden Karnofsky, Andy Matuschak, Jonathan Blow, Tyler Cowen, George Hotz, and others.

What are Tacit Knowledge Videos?

Samo Burja claims YouTube has opened the gates for a revolution in tacit knowledge transfer. Burja defines tacit knowledge as follows:

Tacit knowledge is knowledge that can’t properly be transmitted via verbal or written instruction, like the ability to create

...

(Continue Reading – 3727 more words)

quailia8h10

Networking, Relationship building, both professional and personal, I'm sure there are overlaps. And echoing another request: Sales

Difference between European and US healthcare systems [discussion post]

Nathan Young

The discussion is:

The difference between EU and US healthcare systems

The rules are:

A limited group of users will discuss the issue
Participants will only make arguments they endorse
If you receive a link to this without being told you are discussing, then you are an "other user"
Other users can reply to the e comment reserved for that purpose. Otherwise their comments will be deleted
Other users can vote or emote on any comment

Nathan Young43m20

This comment may be replied to by anyone.

Other comments are for the discussion group only.

Examples of Highly Counterfactual Discoveries?

133

johnswentworth, kromem

The history of science has tons of examples of the same thing being discovered multiple time independently; wikipedia has a whole list of examples here. If your goal in studying the history of science is to extract the predictable/overdetermined component of humanity's trajectory, then it makes sense to focus on such examples.

But if your goal is to achieve high counterfactual impact in your own research, then you should probably draw inspiration from the opposite: "singular" discoveries, i.e. discoveries which nobody else was anywhere close to figuring out. After all, if someone else would have figured it out shortly after anyways, then the discovery probably wasn't very counterfactually impactful.

Alas, nobody seems to have made a list of highly counterfactual scientific discoveries, to complement wikipedia's list of multiple discoveries.

To...

(See More – 189 more words)

3Leon Lang2h

I guess (but don't know) that most people who downvote Garrett's comment overupdated on intuitive explanations of singular learning theory, not realizing that entire books with novel and nontrivial mathematical theory have been written on it.

2tailcalled2h

Newton's Universal Law of Gravitation was the first highly accurate model of things falling down that generalized beyond the earth, and it is also the second-most computationally applicable model of things falling down that we have today. Are you saying that singular learning theory was the first highly accurate model of breadth of optima, and that it's one of the most computationally applicable ones we have?

Alexander Gietelink Oldenziel25m20

Did I just say SLT is the Newtonian gravity of deep learning? Hubris of the highest order!

But also yes... I think I am saying that

Singular Learning Theory is the first highly accurate model of breath of optima.
- SLT tells us to look at a quantity Watanabe calls $λ$ , which has the highly-technical name 'real log canonical threshold. He proves several equivalent ways to describe it one of which is as the (fractal) volume scaling dimension around the optima
- The RLCT = $λ$ first-order term for in-distribution generalization error and also Bayesian lear

... (read more)

1cubefox4h

There is a large difference between sooner and later. Highly non-obvious ideas will be discovered later, not sooner. The fact that China didn't rediscover the theory in more than two thousand years means that it the ability to sail the ocean didn't make it obvious. As far as we know, nobody did, except for early Greece. There is some uncertainty about India, but these sources are dated later and from a time when there was already some contact with Greece, so they may have learned it from them.

Towards Monosemanticity: Decomposing Language Models With Dictionary Learning

286

Zac Hatfield-Dodds

Ω 1107mo

This is a linkpost for https://transformer-circuits.pub/2023/monosemantic-features/

Text of post based on our blog post as a linkpost for the full paper which is considerably longer and more detailed.

Neural networks are trained on data, not programmed to follow rules. We understand the math of the trained network exactly – each neuron in a neural network performs simple arithmetic – but we don't understand why those mathematical operations result in the behaviors we see. This makes it hard to diagnose failure modes, hard to know how to fix them, and hard to certify that a model is truly safe.

Luckily for those of us trying to understand artificial neural networks, we can simultaneously record the activation of every neuron in the network, intervene by silencing or stimulating them, and test the network's response to any possible...

(See More – 533 more words)

1Rosco-Hunter4h

This was a really interesting paper; however, I was left with one question. Can anyone argue why exactly the model is motivated to learn a much more complex function than the identity map? An auto-encoder whose latent space is much smaller than the input is forced to learn an interesting map; however, I can't see why a highly over-parameterised auto-encoder wouldn't simply learn something close to an identity map. Is it somehow the regularisation or the bias terms? I'd love to hear an argument for why the auto-encoder is likely to learn these mono-semantic features as opposed to an identity map.

Zac Hatfield-Dodds33m20

It's a sparse autoencoder because part of the loss function is an L1 penalty encouraging sparsity in the hidden layer. Otherwise, it would indeed learn a simple identity map!

To get the best posts emailed to you, create an account! (2-3 posts per week, selected by the LessWrong moderation team.)

Knowledge Base 8: The truth as an attractor in the information space

iwis

39m

Hypothesis: The truth, defined as the information with 100% credibility calculated by the HITS algorithm in accordance with the consensus theory of truth, is an attractor in the information space.

Interface between all intelligent agents

The post Knowledge Base 2: The structure and the method of building describes the possibility of building a knowledge database using crowdsourcing. After adding initial knowledge using crowdsourcing, also artificial intelligence and robots could add knowledge to this database during the execution of tasks, for example, while finding answers to the questions of their users or users of the database. People, computer programs, and robots could collaborate with each other to add and evaluate knowledge, increasing the amount and credibility of useful knowledge. Thus, this database could be a common interface for the exchange...

(See More – 507 more words)

Bayesian inference without priors

DanielFilan

16h

Epistemic status: party trick

Why remove the prior

One famed feature of Bayesian inference is that it involves prior probability distributions. Given an exhaustive collection of mutually exclusive ways the world could be (hereafter called ‘hypotheses’), one starts with a sense of how likely the world is to be described by each hypothesis, in the absence of any contingent relevant evidence. One then combines this prior with a likelihood distribution, which for each hypothesis gives the probability that one would see any particular set of evidence, to get a posterior distribution of how likely each hypothesis is to be true given observed evidence. The prior and the likelihood seem pretty different: the prior is looking at the probability of the hypotheses in question, whereas the likelihood is looking at...

(Continue Reading – 2092 more words)

2Razied6h

Most of the weird stuff involving priors comes into being when you want posteriors over a continuous hypothesis space, where you get in trouble because reparametrizing your space changes the form of your prior, so a uniform "natural" prior is really a particular choice of parametrization. Using a discrete hypothesis space avoids big parts of the problem.

Richard_Kennaway40m20

Using a discrete hypothesis space avoids big parts of the problem.

Only if there is a "natural" discretisation of the hypothesis space. It's fine for coin tosses and die rolls, but if the problem itself is continuous, different discretisations will give the same problems as different continuous parameterisations.

In general, when infinities naturally arise but cause problems, decreeing that everything must be finite does not solve those problems, and introduces problems of its own.

9Jsevillamol10h

I've been tempted to do this sometime, but I fear the prior is performing one very important role you are not making explicit: defining the universe of possible hypothesis you consider. In turn, defining that universe of probabilities defines how bayesian updates look like. Here is a problem that arises when you ignore this: https://www.lesswrong.com/posts/R28ppqby8zftndDAM/a-bayesian-aggregation-paradox

Thoughts on seed oil

244

dynomight

This is a linkpost for https://dynomight.net/seed-oil/

A friend has spent the last three years hounding me about seed oils. Every time I thought I was safe, he’d wait a couple months and renew his attack:

“When are you going to write about seed oils?”

“Did you know that seed oils are why there’s so much {obesity, heart disease, diabetes, inflammation, cancer, dementia}?”

“Why did you write about {meth, the death penalty, consciousness, nukes, ethylene, abortion, AI, aliens, colonoscopies, Tunnel Man, Bourdieu, Assange} when you could have written about seed oils?”

“Isn’t it time to quit your silly navel-gazing and use your weird obsessive personality to make a dent in the world—by writing about seed oils?”

He’d often send screenshots of people reminding each other that Corn Oil is Murder and that it’s critical that we overturn our lives...

(Continue Reading – 4926 more words)

1Slapstick1h

I would consider most bread sold in stores to be processed or ultra processed and I think that's a pretty standard view but it's true there might be some confusion. I would consider all of those to be processed and unhealthy and I think thats a pretty standard view, but fair enough if there's some confusion around those things. I guess my view is that it's mostly not hogwash? The least healthy things are clearly and broadly much more processed than the healthiest things.

1Slapstick1h

I typically consume my greens with ground flax seeds in a smoothie. I feel very confident that adding refined oil to vegetables shouldn't be considered healthy, in the sense that the opportunity cost of 1 Tablespoon of olive oil is 120 calories, which is over a pound of spinach for example. Certainly it's difficult to eat that much spinach and it's probably unwise, but I just say that to illustrate that you can get a lot more nutrition from 120 calories than the oil will be adding, even if it makes the greens more bioavailable. That said "healthy" is a complicated concept. If adding some oil to greens helps something eat greens they otherwise wouldn't eat for example, that's great.

Ann1h10

Raw spinach in particular also has high levels of oxalic acid, which can interfere with the absorption of other nutrients, and cause kidney stones when binding with calcium. Processing it by cooking can reduce its concentration and impact significantly without reducing other nutrients in the spinach as much.

Grinding and blending foods is itself processing. I don't know what impact it has on nutrition, but mechanically speaking, you can imagine digestion proceeding differently depending on how much of it has already been done.

You do need a certain amount of... (read more)

1Slapstick2h

I am perhaps not speaking as precisely as I should be. I appreciate your comments. I believe it's correct to say that if you consider all of the food/energy we consumed in the past 50+ million years, it's virtually all plants. The past 2-2.5 million years had us introducing more animal products to greater or lesser extents. Some were able to subsist on mostly animal products. Some consumed them very rarely. In that sense it is a relatively recent introduction. My main point is that given our evolutionary history, the idea that plants would be healthier for us than animal products when we have both in abundance, and the idea that plants are more suitable to maintaining health long past reproductive age, aren't immediately/obviously unreasonable ideas.

LESSWRONG
LW

Quick Takes

Popular Comments

Recent Discussion