LESSWRONG

Jeffrey Liang

Comments
Childhood and Education: College Admissions
Jeffrey Liang · 2mo · 10

I don't think you understood me. I totally hate the system and wish it were different.

I was responding to the OP stating that it's not immoral to deceive on college applications!

Childhood and Education: College Admissions
Jeffrey Liang · 2mo · 3 · -4

I would disagree that it's okay to treat college applications like games of deception. Yes, the system is incredibly stupid, BUT it has a large real impact on your life, and that's what makes the difference. Deceiving might make your life better or more comfortable, but that's exactly what makes it a relevant moral problem. And if you give in to that temptation, you'll probably also be tempted by those high-paying zero-sum/exploitative careers post-graduation, and then congrats -- you've sold out.

Besides, a country that lets a system which selects new elites based on vice rather than merit persist deserves the decay that comes with that.

Finally, I'm not sure if I'm extrapolating too much from my own experience, but I feel like if you're really competent you can do great regardless of elite uni acceptance. Most of the demand is from those seeking high-paying and/or comfortable zero-sum/exploitative careers, which you shouldn't want anyway.

The Croissant Principle: A Theory of AI Generalization
Jeffrey Liang · 3mo · 10

Yeah I was originally envisioning this as an ML theory paper which is why it's math-heavy and doesn't have experiments. Tbh, as far as I understand, my paper is far more useful than most ML theory papers because it actually engages with empirical phenomena people care about and provides reasonable testable explanations.

Ha, I think some rando saying "hey I have plausible explanations for two mysterious regularities in ML via this theoretical framework but I could be wrong" is way more attention-worthy than another "I proved RH in 1 page!" or "I built ASI in my garage!"

Mmm, I know how to do "good" research. I just don't think it's a "good" use of my time. I honestly don't think adding citations and a lit review will help anybody nearly as much as working on other ideas.

PS: Just because someone doesn't flash their credentials, doesn't mean they don't have stellar credentials ;)

The Croissant Principle: A Theory of AI Generalization
Jeffrey Liang · 3mo · 10

Oh yes I do know math lol. Yeah the summary above hits most of the main ideas if you're not too familiar with pure math.

The Croissant Principle: A Theory of AI Generalization
Jeffrey Liang · 3mo · 10

Thanks, interesting! I hadn't read this paper before.

Some initial thoughts:

  1. Very cool and satisfying that all these scaling laws might emerge from metric space geometry (i.e. dimensionality).
  2. Main differences seem to be: they tackle model scaling, their data manifold is a product of the model while our latent space is a property of the data and its generating process itself, and they provide empirical evidence.
  3. They note that model scaling seems to be pretty independent of architecture. I wonder if the relevant model scaling law in most cases is more similar to our model where it's a property of the data before being processed by the model.
  4. I might get around to running empirical experiments for this, though I'm pretty busy trying out all my other ideas heh. Would definitely welcome work from others on this! The way I was thinking about testing this was to set up a synthetic regression dataset where you explicitly generate data from a latent space and see how loss scales as you increase data.
Discontinuous Linear Functions?!
Jeffrey Liang · 3mo · 40

Perhaps! I'm not familiar with extended norms. But when you say "let's put the uniform norm on C1(R)" warning bells start going off in my head 😅

Discontinuous Linear Functions?!
Jeffrey Liang · 3mo · 123

Okay I took the nerd bait and signed up for LW to say:

For your example to work you need to restrict the domain of your functions to some compact set, e.g. C1([0,1]), because the uniform norm requires the functions to be bounded.

Also note this example works because you're not using the "usual" topology on C1([0,1]), which also includes the uniform norm of the derivative and makes the space complete. It is much more difficult to find a discontinuous linear map if the domain space is complete!
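To make the unboundedness concrete, here's a quick numerical sketch (my own choice of witness functions, not from the thread): take f_n(x) = sin(nx)/n in C1([0,1]) with the uniform norm. The norms ||f_n||_inf shrink to zero, yet the linear functional f ↦ f'(0) stays pinned at 1 on every f_n, so it cannot be continuous.

```python
import numpy as np

xs = np.linspace(0.0, 1.0, 100_001)  # fine grid on the compact domain [0, 1]

for n in [1, 10, 100, 1000]:
    f = np.sin(n * xs) / n           # f_n is in C^1([0, 1])
    sup_norm = np.abs(f).max()       # ||f_n||_inf <= 1/n, so it tends to 0
    deriv_at_0 = float(np.cos(0.0))  # f_n'(x) = cos(nx), hence f_n'(0) = 1 for every n
    print(f"n = {n:4d}   ||f_n||_inf = {sup_norm:.4f}   f_n'(0) = {deriv_at_0}")
```

The same sequence shows the derivative operator f ↦ f' is unbounded as a map into C([0,1]) with the sup norm, since ||f_n'||_inf = 1 for all n while ||f_n||_inf → 0.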

The Croissant Principle: A Theory of AI Generalization · 3mo · 20 · 6