LESSWRONG

Niclas Kupper
Meditations on Doge
Niclas Kupper · 2mo · 30

I’ve lived near DC for about 40 years of my life. I haven’t seen anyone succeed with regulatory reforms.

 

I thought people considered the National Partnership for Reinventing Government (NPR) to have been generally successful. I think the important difference from DOGE is the NPR's goals: "work better, cost less, and get results Americans care about". DOGE's only KPI is seemingly the nominal dollar amount cut, with no regard for impact or trade-offs.

The Bell Curve of Bad Behavior
Niclas Kupper · 3mo · 40

I think the dojo analogy is very good and useful. Some unstructured thoughts: it gets at a core feature of humans, namely that we can adjust our personalities based on context. I suspect there is a semi-stable equilibrium at play that is important. This is a big reason people underestimate company/community culture: it can give some amount of herd immunity to bad behavior. If sufficiently many people "defect", the culture changes. This is also an issue as communities grow, of course: policing gets harder and nuances of behavior get lost.

One-shot steering vectors cause emergent misalignment, too
Niclas Kupper · 3mo · 30

Great work! I don't know a lot about steering vectors and have some questions. I am also happy for you to just send me to a different resource.

1) From my understanding steering vectors are just things you add to the activations. At what layer do you do this?

2) You write "we optimized four different “harmful code” vectors". How did you "combine" the resulting four different vectors?

3) I would also be interested in how similar these vectors are to each other, and how similar they are to the refusal vector.

Poll on AI opinions.
Niclas Kupper · 4mo · 10

That is fair, I should have probably left some seed statements regarding the definition of AGI / ASI.
EDIT: I have added additional statements.

AI #99: Farewell to Biden
Niclas Kupper · 6mo · 10

I just want to say that Amazon is fairly close to a universal recommendation app!

Transformers Represent Belief State Geometry in their Residual Stream
Niclas Kupper · 1y · 10

Where can I read about this 2-state HMM? By "learn" I just mean approximate via an algorithm. The UAT is not sufficient, as it talks about learning a known function. Baum-Welch is such an algorithm, but as far as I am aware it gives no guarantees on anything really.

Transformers Represent Belief State Geometry in their Residual Stream
Niclas Kupper · 1y · 10

Is there some theoretical result along the lines of "A sufficiently large transformer can learn any HMM"?

Examples of Highly Counterfactual Discoveries?
Niclas Kupper · 1y · 70

It would be interesting for people to post current research that they think has some small chance of outputting highly singular results!

Examples of Highly Counterfactual Discoveries?
Answer by Niclas Kupper · Apr 24, 2024 · 50

Grothendieck seems to have been an extremely singular researcher; several of his discoveries would likely have been significantly delayed without him. His work on sheaves is mind-bending the first time you see it and was seemingly ahead of its time.

Feedbackloop-first Rationality
Niclas Kupper · 2y · 64

As someone who is currently getting a PhD in mathematics, I wish I could use Lean. The main problem for me is that the area I work in hasn't been formalized in Lean yet. I tried for like a week, but didn't get very far... I only managed to implement the definition of a Poisson point process (kinda). I concluded that it wasn't worth spending my time to create this feedback loop and I'd rather work based on vibes.

I am jealous of the next generation of mathematicians that are forced to write down everything using formal verification. They will be better than the current generation.

Posts
1 · Poll on AI opinions. · 5mo · 2
18 · Poll Results on AGI · 3y · 0
12 · LessWrong Poll on AGI · 3y · 6
4 · Noisy environment regulate utility maximizers · 3y · 0