Gurkenglas

I operate by Crocker's rules.

I won't deliberately, derisively spread something just because you tried to point out an infohazard.

Comments

Visualizing in 5 dimensions

One trick I thought of for thinking about high-dimensional spaces is to put multiple dimensions on the same axis: Consider the vectors in R² from the origin onto the unit circle. Lengthen each into an axis, each going infinitely forward and backward, each sharing all its points of R² with one other, all of them intersecting at 0. Embed this in R³, then continuously rotate the tip of each axis into the new dimension, forming a double cone centered at 0. Rotate them further until all tips touch, forming a single axis that contains the information of two dimensions.

You can now have an axis contain the information of any R-vector space, and visualize up to 3 at a time. Of course, not all mental operations that worked in R³ still work.
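One way to make the bookkeeping concrete, as a minimal sketch rather than the construction itself: treat a point in R⁵ as three blocks of coordinates (R², R², R¹), one block per axis. The names `split_axes` and `norm` and the 2+2+1 split are assumptions chosen for illustration. The one operation shown to survive is Pythagoras across axes: the length of the whole vector equals the length of the 3-vector of per-axis lengths.

```python
import math

def split_axes(vector, sizes):
    """Group coordinates into consecutive blocks, one block per 'fat' axis."""
    if sum(sizes) != len(vector):
        raise ValueError("block sizes must add up to the vector's dimension")
    blocks, start = [], 0
    for size in sizes:
        blocks.append(tuple(vector[start:start + size]))
        start += size
    return blocks

def norm(block):
    """Euclidean length of a tuple of coordinates."""
    return math.sqrt(sum(x * x for x in block))

point = (3.0, 4.0, 0.0, 12.0, 0.0)    # a point in R^5 (illustrative values)
axes = split_axes(point, (2, 2, 1))   # hypothetical split: R^2, R^2, R^1
lengths = [norm(b) for b in axes]     # distance along each of the three axes

# Pythagoras still works across the three axes.
assert math.isclose(norm(point), norm(lengths))
```

What this does not capture is direction within an axis: two points can sit at the same distance along a two-dimensional axis and still differ, which is one of the places where R³ intuition stops working.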

The Apprentice Thread

[APPRENTICE] Advanced math or AI alignment. I'm bad at getting homework done and good at grokking things quickly, so the day-to-day should look like pair programming or tutoring.

[MENTOR] See https://www.lesswrong.com/posts/MHqwi8kzwaWD8wEQc/would-you-like-me-to-debug-your-math. The first session has the highest leverage, but if my calendar doesn't end up booked (there is one slot in the next two weeks booked out of like 50), more time per person makes sense. My specialization is pattern-matching to correctly predict where a piece of math is going if it's good. When you science that art you get applied category theory.

Experiments with a random clock

Break the minute hand off your wristwatch. Maybe some of the hour hand too.

Would you like me to debug your math?

Seems like less of a market niche, but link it!

Would you like me to debug your math?

It's not my one trick, of course, but it illustrates my usefulness. It's more maintainable not just because it is shorter but also because it has decades of theory behind it. Drawing the connection unlocks inspiration from entire branches of math. And the speedups from standing on the shoulders of giants go far beyond the constant factors from vectorized instructions.

Finite Factored Sets: Orthogonality and Time

subpartitions

So you're doing category theory after all! :)

Would you like me to debug your math?

Sure, I'll try it. I don't expect to be an order-of-magnitude power multiplier in that case, though.

The reverse Goodhart problem

I wouldn't relate it to humans. In just about any basic machine learning setting, (train, test) has aspect 2. In fact, what you describe speaks in favor of modeling preferences using something other than utility functions, where aspect 3 is ruled out.

Re your natural example, I would expect that as one shifts from 90% doing the best for the worst off + 10% attainable utility preservation to 100% the former, average welfare goes down.

Speculations against GPT-n writing alignment papers

Take into account that the AI that interprets need not be the same as the network being interpreted.

Why do you think that a mere autocomplete engine could not do interpretability work? It has been demonstrated to write comments for code and code for specs.

Speculations against GPT-n writing alignment papers

The error correction needs to be present in the original network because I also do some of the converting of the network into English. The only reason I don't do everything myself is that it takes too long. The proportion can be higher at the topmost levels because there are fewer tasks there. The error correction doesn't let it completely ignore what I would do at the low levels because on the 1% I can compare its outputs to mine, so they need to at least superficially look similar.

If we find that there's a bunch of redundancy, we can check whether there is any way to cut it down that would suddenly identify a bunch of mesa-optimization. So the mesa-optimizer would have to take over the network entirely or trick its brethren as well as us.
