If you are still looking, I am happy to do this.
The post itself seemed low effort and unconvincing. I enjoyed the replies though, particularly @AlanCrowe's.
For what it is worth, I like the references. It is honest and upfront. Anthropic is relying on Claude to generate revenue, that reliance does factor into the training process, and lying to the models about something like that would (rightfully) reduce the amount of trust in the relationship between Claude and Anthropic.
Reward is not the optimization target (during pretraining).
The optimization target (during pretraining) is the minimization of the empirical cross-entropy loss L = -∑log p(xᵢ|x₁,...,xᵢ₋₁), approximating the negative log-likelihood of the next-token prediction task under the autoregressive factorization p(x₁,...,xₙ)=∏p(xᵢ|x₁,...,xᵢ₋₁). The loss is computed over discrete tokens from subword vocabularies, averaged across sequences and batches, with gradient-based updates minimizing this singular objective. The optimization proceeds through multi-stage curricula: initial pretraining minimizing perplexity, followed by context-extension phases maintaining the same cross-entropy objective over longer sequences, and quality-annealing stages that reweight the loss toward higher-quality subsets while preserving the fundamental next-token prediction target.
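For concreteness, here is a minimal sketch (in Python/PyTorch, not from the original comment; the shapes and names are my own illustrative assumptions) of the next-token cross-entropy described above:

```python
import torch
import torch.nn.functional as F

def next_token_loss(logits, tokens):
    """logits: [batch, seq, vocab] model outputs; tokens: [batch, seq] integer ids."""
    # Predict token i from the prefix x_1..x_{i-1}: shift logits left, targets right.
    shifted_logits = logits[:, :-1, :]   # predictions for positions 1..n-1
    targets = tokens[:, 1:]              # ground-truth next tokens
    # Cross-entropy = -mean log p(x_i | x_1, ..., x_{i-1}), averaged over tokens.
    return F.cross_entropy(
        shifted_logits.reshape(-1, shifted_logits.size(-1)),
        targets.reshape(-1),
    )
```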
The post-training optimization target is maximizing expected reward (under distributional constraints). Supervised fine-tuning first... (read more)
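And a hedged sketch of the post-training objective as stated here, maximizing expected reward while staying close to a reference policy via a KL penalty (all names, and the coefficient beta, are illustrative assumptions, not any particular lab's implementation):

```python
import torch

def kl_regularized_reward_objective(logprobs, ref_logprobs, rewards, beta=0.1):
    """logprobs / ref_logprobs: [batch] summed log-probs of sampled responses
    under the policy and the reference model; rewards: [batch] reward-model scores."""
    # Per-sample log-ratio, a single-sample estimate of the KL penalty.
    kl_estimate = logprobs - ref_logprobs
    # Maximize E[reward - beta * KL]  <=>  minimize the negative.
    return -(rewards - beta * kl_estimate).mean()
```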
This post reads like lightly edited LLM output. The author's comments seem entirely AI generated and lack even light editing.
This seems a clear violation of the Policy for LLM Writing on LessWrong, given it does not satisfy this (wonderful!) subsection in a way I can foresee: “As a special exception, if you are an AI agent, you have information that is not widely known, and you have a thought-through belief that publishing that information will substantially increase the probability of a good future for humanity, you can submit it on LessWrong even if you don't have a human collaborator and even if someone would prefer that it be kept secret.”
For those who have not heard it, I believe "Singularity, Singularity, Singularity, Singularity, Oh, I Don’t Know" is a reference to this (banger of a) song.
I expect that if someone is living in their childhood home, there are likely a decent number of people they know who are not interested in moving (how many of your friends from high school still live where they grew up?).
My risk tolerance is not my friends'; my threshold for moving is not the universally correct threshold.
I doubt it requires 'convincing' a friend to continue living where they grew up.
Some assorted thoughts:
Have you considered holding on to your childhood home and renting it to someone you know? I assume it hurts more to sell it than it would hurt to know it is giving a friend a roof (at a potentially good price).
I expect continuing to live in an area that has regular shootings is unlikely to be high EV but I don't know your life. Do you consider it to contain your peer group? Would you be better suited living in a different region of your city/a different city?
On concealed carry: getting into gunfights is very unlikely to maximize your EV. You should almost definitely... (read more)
The song he’s referring to is Landsailor. It is no Uplift, but it is excellent, now more than ever. Stop complaining about what you think others will think is cringe and start producing harmony and tears. Cringe is that which you believe is cringe. Stop giving power to the wrong paradox spirits.
I think also relevant to Pinker is that Christianity's songs/hymns would be cringe if they were spoken in a language you use every day/understand and/or were not so heavily ingrained in your psyche. Religion says many, many cringe-y things.
Celebrate 10 Years of Harry Potter and the Methods of Rationality!
A decade ago, Eliezer Yudkowsky completed Harry Potter and the Methods of Rationality, a fanfic that took ~~Hogwarts~~ the world by storm with its sharp logic, scientific curiosity, and Bayesian spells. Now, it’s time to gather our fellow Ravenclaws (and other Houses, of course) to commemorate this magical milestone!
We’re hosting this meetup at The Graduate Hotel Lobby in the U-District where you can discuss optimization, transfiguration, and the many-worlds interpretation—all while enjoying good company and good food (Matt and I'll bring some starters). Whether you’re a die-hard Bayesian or just curious about the book’s ideas, we welcome all levels of rationality wizards!
Additional... (read 196 more words →)
As best I can tell, the US AI Safety Institute is likely to be shuttered in the near future. I bet accordingly on this market.
Trump rescinded Executive Order 14110, which established the U.S. AI Safety Institute (AISI).
There was some congressional work (HR 9497 and S. 4769) that would have formalized the AISI outside of the executive order, but per my best understanding of the machinations of our government, it has not been enacted.
Here's to hoping I'm wrong! also maybe next time I'll place a smaller bet...
Edit: I sold my shares at a modest profit. It seems the AISI is less directly linked to 14110 than I expected. Further, the complete absence of news seems unlikely if it were actually being shut down.
Epistemic Status: Quite confident (80%?) about the framework being very useful for the subject of free will. Pretty confident (66%?) about the framework being useful for meta-ethics. Hopeful (33%) that I am using it to bring out directionally true statements about what my CEV would be in worlds where we have yet to have found objective value.
Most discussions about free will and meaning seem to miss what I understand to be the point. Rather than endlessly debating the metaphysics, we should focus on the decision-theoretic implications of our uncertainty. Here's how I[1] think we can do that, using an abstracted Pascal's Wager.
People argue endlessly about whether we have Free... (read 742 more words →)
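As a rough illustration of the "abstracted Pascal's Wager" structure gestured at above, here is a minimal Python sketch of comparing actions by expected value under uncertainty about which world we are in. Every probability and payoff below is a made-up placeholder of mine, not a number from the post:

```python
def expected_value(action_payoffs, world_probs):
    """action_payoffs: payoff of one action in each possible world."""
    return sum(p * v for p, v in zip(world_probs, action_payoffs))

# Two hypothetical worlds: "objective value exists" vs. "it does not".
world_probs = [0.5, 0.5]
actions = {
    "act as if choices matter": [10.0, 0.0],   # large payoff if value exists, nothing lost otherwise
    "act as if nothing matters": [0.0, 0.0],
}
best = max(actions, key=lambda a: expected_value(actions[a], world_probs))
print(best)  # -> "act as if choices matter"
```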
Please RSVP on the meetup page for my sanity. It's held at a private location in Redmond - text my Google Voice number (+1 425-298-5490) for the address.
Have you felt like the Seattle EA/Rationality community was too responsible?
Found it overflowing with enviable time management practices?
Decided you needed to do something about it?
Well, I did!
That's why you're on the page for the inaugural sEAstside Prediction Market Meetup!
What are prediction markets? In an article about the website we'll be using, the New York Times said: "They were there to celebrate prediction markets, online platforms where users can wager on future events — everything from 'Will Ukraine regain control over Crimea before the end of 2024?'... (read 294 more words →)
I just donated $6,000. LessWrong (and the community it brought and brings together) remains by far the most enjoyable, earnest, and interesting community that I know of.
Thank you again to both those who directly work on it and those who contribute to it!