Software Engineer at Microsoft
Update: now that Vision Pro is out, would you consider that to meet your definition of "Transformative VR"?
But, of course, these two challenges were completely toy. Future challenges and benchmarks should not be.
I am confused. I imagine that there would still be uses for toy problems in future challenges and benchmarks. Of course, we don’t want to have exclusively toy problems, but I am reading this as advocating for the other extreme without providing adequate support for why, though I may have misunderstood. My defense of toy problems is that they are more broadly accessible, require less investment to iterate on, and allow us to isolate one specific portion of the difficulty, enabling progress to be made in one step, instead of needing to decompose and solve multiple subproblems. We can always discard those toy solutions that do not scale to larger models.
In particular, toy problems are especially suitable as a playground for novel approaches that are not yet mature. These usually are not initially performant enough to justify allocating substantial resources towards but may hold promise eventually once the kinks are ironed out. With a robust set of standard toy problems, we can determine which of these new procedures may be worth further investigation and refinement. This is especially important in a pre-paradigmatic field like mechanistic interpretability, where we may (as an analogy) be in a geocentric era waiting for heliocentrism to be invented.
Nit: I don’t consider polymorphic malware to be that advanced. I made some as a university project. It is essentially automated refactoring. All you need to do is replace sections of a binary with other functionally equivalent sections without breaking it, optionally adding some optimization so that the new variant is classified as benign.
Update: during your interview for a B1/B2 visa, be sure to emphasize the "training" aspect of SERI MATS above the "research" aspect. Got told to submit J1 documents so now I need the organizers to give me a DS 2019.
At least I don't need to reinterview.
Future of compute review - submission of evidence
Prepared by:
Concretely: I think we're 6 months from the crossover point
Now that it's been 6 months since you got your Meta Quest Pro, how has it held up? Also, what are your predictions for Apple's VR headset, which is rumored to release next month?
There's an effect that works in the opposite direction where you lower the hiring bar as headcount scales. Key early hires may have a more stringent filter applied to them than later additions. But the bar can still be arbitrarily high, look at the profiles of people who are joining recently, e.g Leaving Wave, joining Anthropic | benkuhn.net
It's important to be clear about what the goal is: if it's the instrumental careerist goal "increase status to maximize the probability of joining a prestigious organization", then that strategy may look very different from the terminal scientist goal of "reduce x-risk by doing technical AGI alignment work". The former seems much more competitive than the latter.
The following part will sound a little self-helpy, but hopefully it'll be useful:
Concrete suggestion: this weekend, execute on some small tasks which satisfy the following constraints:
Find the tasks in your notes after a period of physical exertion. Avoid searching the internet or digging deeply into your mind (anything you can characterize as paying constant attention to filtered noise to mitigate the risk that some decision relevant information managed to slip past your cognitive systems). Decline anything that spurs an instinct of anxious perfectionism. Understand where you are first and marginally shift towards your desired position.
You sound like someone who has a far larger max step size than ordinary people. You have the ability to get to places by making one big leap. But go to this simulation Why Momentum Really Works (distill.pub) and fix momentum at 0.99. What happens to the solution as you gradually move the step size slider to the right?
Chaotic divergence and oscillation.
Selling your startup to get into Anthropic seems, with all due respect, to be a plan with step count = 1. Recall Expecting Short Inferential Distances. Practicing adaptive dampening would let you more reliably plan and follow routes requiring step count > 1. To be fair, I can kinda see where you're coming from, and logically it can be broken down into independent subcomponents that you work on in parallel, but the best advice I can concisely offer without more context on the details of your situation would be this:
"Learn to walk".