LESSWRONG

Linda Linsefors

Hi, I am a Physicist, an Effective Altruist and AI Safety student/researcher.

Posts


Wikitag Contributions

Comments

  • Linda Linsefors's Shortform (3 karma, 6y, 105 comments)
the gears to ascenscion's Shortform
Linda Linsefors3d20

Typo react from me. I think you should call your links something informative. If you think the title of the post is clickbait, maybe you can re-title it to something better?

Now I have to click to find out what the link is even about, which is also clickbait-y.

Linda Linsefors's Shortform
Linda Linsefors3dΩ240

Estimated MSE loss for three different ways of embedding features into neurons, when there are more possible features than neurons.

I've typed up some math notes on how much MSE loss we should expect for random embeddings, and for some alternative embeddings, when you have more features than neurons. I don't have a good sense of how legible this is to anyone but me.

Note that none of these embeddings is optimal. I believe that the optimal embedding for minimising MSE loss is to store the features in almost orthogonal directions, which is similar to random embeddings but can be optimised more. But I also believe that MSE loss doesn't prefer this solution very much, which means that when there are other tradeoffs, MSE loss might not be enough to incentivise superposition.

This does not mean we should not expect superposition in real networks.

  1. Many networks use other loss functions, e.g. cross-entropy.
  2. Even if the loss is MSE on the final output, this does not mean MSE is the right loss for modeling the dynamics in the middle of the network.

 

Setup and notation

  • T features
  • D neurons
  • z active features

Assuming: 

  • z≪D<T

True feature values:

  • y = 1               for active features
  • y = 0               for inactive features

 

Using random embedding directions (superposition)

Estimated values:

  • $\hat{y} = a + \epsilon$, where $E[\epsilon^2] = (z-1)a^2/D$, for active features
  • $\hat{y} = \epsilon$, where $E[\epsilon^2] = z a^2/D$, for inactive features

Total Mean Squared Error (MSE)

$$\mathrm{MSE}_{\mathrm{rand}} = z\left((1-a)^2 + \frac{(z-1)a^2}{D}\right) + (T-z)\frac{z a^2}{D} \approx z(1-a)^2 + \frac{zT}{D}a^2$$

This is minimised by 

$$a = \frac{D}{T+D}$$

Making MSE

$$\mathrm{MSE}_{\mathrm{rand}} = \frac{zT}{T+D} = z\left(1 - \frac{D}{T+D}\right)$$
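
The minimisation above can be checked numerically. This is a minimal sketch, not part of the original notes: the function names and the sizes z, T, D are hypothetical choices that satisfy z ≪ D < T. It evaluates the approximate random-embedding MSE on a grid of values of a and compares the grid minimiser with a = D/(T+D).

```python
def mse_rand_approx(a, z, T, D):
    """Approximate total MSE for random embeddings: z(1-a)^2 + (zT/D) a^2."""
    return z * (1 - a) ** 2 + z * T / D * a ** 2


def mse_rand_full(a, z, T, D):
    """Same loss before the approximation: active terms plus inactive terms."""
    active = z * ((1 - a) ** 2 + (z - 1) * a ** 2 / D)
    inactive = (T - z) * z * a ** 2 / D
    return active + inactive


# Hypothetical sizes, chosen only to satisfy z << D < T.
z, T, D = 3, 1000, 100

a_star = D / (T + D)                       # claimed minimiser
grid = [i / 10000 for i in range(10001)]   # candidate values of a in [0, 1]
a_grid = min(grid, key=lambda a: mse_rand_approx(a, z, T, D))

print("closed-form a:", a_star, "grid minimiser:", a_grid)
print("MSE at a*:", mse_rand_approx(a_star, z, T, D), "zT/(T+D):", z * T / (T + D))
print("loss before approximation at a*:", mse_rand_full(a_star, z, T, D))
```

The un-approximated loss differs from the approximate one only by the dropped term za²/D, which is small when a is close to D/(T+D) and D is large.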

 

One feature per neuron

We embed a single feature in each neuron, and the rest of the features are just not represented.

Estimated values:

  • $\hat{y} = y$ for represented features
  • $\hat{y} = 0$ for non-represented features

Total Mean Squared Error (MSE)

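Only D of the T features are represented, so each active feature is unrepresented with probability $(T-D)/T$, in which case it contributes a squared error of 1. This gives

$$\mathrm{MSE}_{\mathrm{single}} = z\,\frac{T-D}{T} = z\left(1 - \frac{D}{T}\right)$$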
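
The one-feature-per-neuron case is simple enough to simulate directly. This is a minimal sketch under stated assumptions, not part of the original notes: features 0 to D−1 are taken to be the represented ones, the z active features are sampled uniformly without replacement, and the sizes are arbitrary. Each active feature without a neuron is estimated as 0 and so contributes a squared error of 1. The average is compared with z(T−D)/T, the expected number of active features that have no neuron.

```python
import random


def simulate_single(z, T, D, trials=20000, seed=0):
    """Average total squared error when only features 0..D-1 have a neuron."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        active = rng.sample(range(T), z)
        # Represented features are reconstructed exactly (error 0).
        # Unrepresented active features are estimated as 0 (error 1 each).
        total += sum(1 for f in active if f >= D)
    return total / trials


# Hypothetical sizes for illustration.
z, T, D = 3, 100, 40
estimate = simulate_single(z, T, D)
expected = z * (T - D) / T

print("simulated:", estimate, "z(T-D)/T:", expected)
```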

 

One neuron per feature

We embed each feature in a single neuron.

  • $\hat{y} = a\sum y$, where the sum is over all features that share the same neuron

We assume that the probability of co-activated features on the same neuron is small enough to ignore. We also assume that every neuron is used at least once. Then for any active feature, the expected number of inactive features that will be wrongfully activated is $\frac{T-D}{D}$, giving us the MSE loss for this case as

$$\mathrm{MSE}_{\mathrm{multi}} = z\left((1-a)^2 + \left(\frac{T}{D}-1\right)a^2\right)$$

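We can already see that this is smaller than $\mathrm{MSE}_{\mathrm{rand}}$, but let's also calculate what the minimum value is. $\mathrm{MSE}_{\mathrm{multi}}$ is minimised by

$$a = \frac{D}{T}$$

Making MSE

$$\mathrm{MSE}_{\mathrm{multi}} = z\left(1 - \frac{D}{T}\right)$$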
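
The one-neuron-per-feature result can be sanity-checked with a small simulation. This is a minimal sketch, not part of the original notes, and it makes some assumptions: feature f is assigned to neuron f mod D (a hypothetical assignment that puts exactly T/D features on every neuron when D divides T), active features are sampled uniformly, and the sizes are arbitrary. Unlike the formula, the simulation does not ignore co-activation on the same neuron, so a small gap is expected.

```python
import random


def mse_multi(a, z, T, D):
    """Closed form from the text: z((1-a)^2 + (T/D - 1) a^2)."""
    return z * ((1 - a) ** 2 + (T / D - 1) * a ** 2)


def simulate_multi(a, z, T, D, trials=5000, seed=0):
    """Average total squared error with feature f stored on neuron f % D."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        active = set(rng.sample(range(T), z))
        counts = [0] * D                  # active features per neuron
        for f in active:
            counts[f % D] += 1
        for f in range(T):
            y = 1.0 if f in active else 0.0
            y_hat = a * counts[f % D]     # estimate is a times the neuron's sum
            total += (y_hat - y) ** 2
    return total / trials


# Hypothetical sizes for illustration; D divides T.
z, T, D = 2, 120, 40
a_star = D / T

closed = mse_multi(a_star, z, T, D)
simulated = simulate_multi(a_star, z, T, D)
rand_min = z * T / (T + D)                # minimum for random embeddings

print("closed form:", closed, "z(1 - D/T):", z * (1 - D / T))
print("simulated:", simulated)
print("random-embedding minimum:", rand_min)
```

With these sizes the closed form at a = D/T equals z(1 − D/T) and sits below the random-embedding minimum zT/(T+D), in line with the comparison made in the text.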
Debugging for Mid Coders
Linda Linsefors17d20

I'm not sure that answered your question, but maybe you can ask a more specific one now.

 

The thing I was after was: what is the actual concrete causal chain from rationality training to you getting better at debugging?

I currently think the answer is that the rationality training made you motivated, and that was the missing part that stopped you from getting better before. Let me know if you think I'm missing something important.

Debugging for Mid Coders
Linda Linsefors17d20

Interesting. Reading your comment makes me notice that I'm more motivated to learn object level skills than meta level skills.

"meta level" != "rationality".
E.g. I would count most of the CFAR curriculum as object level skills. But the training you're working on seems more like meta level skills.

I expect motivation to be super central for which learning methods work. There have been a number of posts on ACX about school (including 2 that are part of the review contest). The common theme is that the main bottleneck is students' motivation.

Debugging for Mid Coders
Linda Linsefors17d20

I didn't improve much at debugging until I got generally serious about rationality training.

 

Can you expand on this please?

A Self-Dialogue on The Value Proposition of Romantic Relationships
Linda Linsefors17d40

NSFW question

How do you maintain breath control on someone who is panicking?

I've tried a bit of holding someone's mouth and nose, from both sides of the experience, and haven't figured out a way that actually stops the person from breathing if they try hard enough.

A Self-Dialogue on The Value Proposition of Romantic Relationships
Linda Linsefors20d40

No, I don't think what you say matches my experience. My anxiety was pointing straight at the thing I needed. Although I acknowledge I did not put forward enough details for this to be clear to you.

But it did not tell me exactly how much I would need. So it's more like you're hungry, and you eat some, and notice that you're still hungry, and then start to wonder if eating is actually what you need, or if this hunger feeling is about something else.

I don't know what you mean by "generic safety net" or "safety in the literal sense". I assumed based on context that we're not talking about physical safety.

I mean things like: I'm not lonely and I expect to continue not to be lonely, because I found people I like who reliably also want me around.

A Self-Dialogue on The Value Proposition of Romantic Relationships
Linda Linsefors21d50

I don't know what is true for the typical person, and I'm definitely not a typical person.

With those caveats, what you describe is not true for me. To feel ok, I need to have a handful of close friends that I see regularly. This provides some sort of validation, among other values. If I have this, my social anxiety is low. If I don't have this, my anxiety is high and causes lots of problems.

It might look like my anxiety was resistant to being cured by more safety, because it took me a long time to find the people I need. Before I found people of my approximate neurotype, I was so far from being ok that it was unclear to me that the thing I could clearly feel I was missing was something that could exist.

And it's not the case that the further from the safe situation I am, the more anxiety I feel. It's more like a step function.

Also, sometimes the anxiety needs some time to fully update on a new situation. This looks like the anxiety coming back. And then I focus on the evidence that things are actually ok, or ask for some help to do this, and then it goes away. This does not work if things are not actually ok.

I can see how this could look like anxiety is conserved, over a lot of different datapoints, and I don't know how someone can tell the difference until they have experienced sufficient safety.

A Self-Dialogue on The Value Proposition of Romantic Relationships
Linda Linsefors21d62

Maybe the reason people stick to what they are good at is not a lack of motivation to explore, but the lack of a safety net that would let them explore. This seems to explain all your observations, if you assume most people are much more anxious than you. In this case, what other people need to grow is more acceptance in their life, not more pushing.

A Self-Dialogue on The Value Proposition of Romantic Relationships
Linda Linsefors21d50

I disagree that it's hard, in the relevant context.

It's hard to communicate this to someone who doesn't have a distinction between the two concepts in their head. It's also hard to communicate this to someone who is too quick to jump to conclusions regarding what you mean to say, and who also has bad priors about you. This is enough of a problem that I don't recommend offering discernments to people you don't know well. But that's also kind of a moot point, since I think it's bad to offer unsolicited advice to people you don't know well, for other reasons.

But with someone like a romantic partner, or a close friend, with whom you'd have lots of long-form conversations, I don't think it's hard.

You can in fact just say: "I love you as you are, and among the things I love about you is the desire to grow stronger. I've noticed a way you could be stronger, do you want to hear it now or later?"

Or if you have established the words "discernment vs judgment" you can just preface any suggestion for improvement with "discernment". Or whatever communication style works for you.

Later into the relationship, you might not even have to clarify; the person will just have the correct prior that you're expressing a discernment, and not a judgment.

Outer Alignment
2y
(+9/-80)
Inner Alignment
2y
(+13/-84)
  • Circuits in Superposition 2: Now with Less Wrong Math (72 karma, 2mo, 0 comments)
  • Is the output of the softmax in a single transformer attention head usually winner-takes-all? (25 karma, 7mo, 1 comment)
  • Theory of Change for AI Safety Camp (36 karma, 7mo, 3 comments)
  • We don't want to post again "This might be the last AI Safety Camp" (36 karma, 7mo, 17 comments)
  • Funding Case: AI Safety Camp 11 (60 karma, 8mo, 4 comments)
  • AI Safety Camp 10 (38 karma, 10mo, 9 comments)
  • Invitation to lead a project at AI Safety Camp (Virtual Edition, 2025) (17 karma, 1y, 2 comments)
  • AISC9 has ended and there will be an AISC10 (75 karma, 1y, 4 comments)
  • Some costs of superposition (46 karma, 2y, 11 comments)
  • This might be the last AI Safety Camp (196 karma, 2y, 34 comments)