People who helped Jews during WWII are intriguing. They appear to be some kind of moral supermen. They had almost nothing to gain and everything to lose. How did they differ from the general population? Can we do anything to get more such people today?
cross-posted from niplav.site
This text looks at the accuracy of forecasts in relation to the time between forecast and resolution, and asks three questions: First, is the accuracy higher between forecasts? Second, is the accuracy higher between questions? Third, is the accuracy higher within questions? These questions are analyzed using data from PredictionBook and Metaculus. The answers turn out to be yes, unclear and yes for Metaculus data, and no, no and yes for PredictionBook data. Possible reasons are discussed. I also try to find out how far humans can look into the future, leading to various different results.
...Above all, don’t ask what to believe—ask what to anticipate. Every question of belief should flow from a question of anticipation, and that question of anticipation should be the center of the inquiry. Every guess of belief should begin by flowing
(My native language is Chinese.) I haven't started reading, but I am finding the abstract/tldr impossible to understand. "Is the accuracy higher between forecasts" reads like a nonsensical sentence. My best guess after reading one extra paragraph by click through is that the question is actually "are forecasts predicting the near future more accurate than those predicting the distant future" but I don't feel like it is possible to decode just based on the abstract.
When presenting data from SAEs, try plotting against and fitting a Hill curve.
Sparse autoencoders are hot, people are experimenting. The typical graph for SAE experimentation looks something like this. I'm using borrowed data here to better illustrate my point, but I have also noticed this pattern in my own data:
This shows quantitative performance adequately in this case. However, it gets a bit messy when there are 5-6 plots very close to each other (e.g. in an ablation study), and it doesn't give an easily-interpreted (heh) value to quantify Pareto improvements.
I've found it much more helpful to plot on the -axis, and "performance...
I've found the MSE-L0 (or downstream loss-L0) frontier plot to be much easier to interpret when both axes are in log space.
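As a rough illustration of the Hill-curve suggestion above, here is a minimal sketch in Python. The excerpt does not preserve which quantities go on each axis, so this assumes (purely for illustration) that x is L0 and y is some performance score normalised to lie strictly between 0 and 1. The helper names `hill` and `fit_hill` are hypothetical, the data is synthetic, and fitting by linear regression in log space is one simple choice rather than necessarily the method the post has in mind.

```python
import math


def hill(x, y_max, k, n):
    """Hill curve: rises from 0 towards y_max, reaching y_max / 2 at x == k."""
    return y_max * x**n / (k**n + x**n)


def fit_hill(xs, ys, y_max=1.0):
    """Estimate (k, n) by least squares on the linearised Hill plot.

    With y_max held fixed, log(y / (y_max - y)) = n * log(x) - n * log(k),
    so a straight-line fit in log space recovers both parameters.
    Requires every x > 0 and every 0 < y < y_max.
    """
    us = [math.log(x) for x in xs]
    vs = [math.log(y / (y_max - y)) for y in ys]
    mean_u = sum(us) / len(us)
    mean_v = sum(vs) / len(vs)
    slope = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs)) / sum(
        (u - mean_u) ** 2 for u in us
    )
    intercept = mean_v - slope * mean_u
    n = slope
    k = math.exp(-intercept / n)
    return k, n


# Synthetic points generated from a known curve, NOT real SAE results.
# Assumption: x is L0 and y is a normalised performance score in (0, 1).
l0_values = [5, 10, 20, 40, 80, 160]
scores = [hill(x, 1.0, 40.0, 1.5) for x in l0_values]

k_fit, n_fit = fit_hill(l0_values, scores)
print(f"k = {k_fit:.3f}, n = {n_fit:.3f}")
```

Under these assumptions, the fitted `k` (the x value at which the score reaches half of `y_max`) gives one number per run that could be compared across an ablation study, instead of eyeballing several nearly overlapping curves. The linearisation also ties in with the log-space comment: on log-scaled axes the transformed points fall on a straight line whenever the Hill form fits well.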
Like many nerdy people, back when I was healthy, I was interested in subjects like math, programming, and philosophy. But 5 years ago I got sick with a viral illness and never recovered. For the last couple of years I've been spending most of my now-limited brainpower trying to figure out how I can get better.
I occasionally wonder why more people aren't interested in figuring out illnesses such as my own. Mysterious chronic illness research has a lot of the qualities of an interesting puzzle:
Oh, that's probably a lot of evidence against a worm. I am out of ideas. Good luck. I hope you can figure it out.
(edit: discussions in the comments section have led me to realize there have been several conversations on LessWrong related to this topic that I did not mention in my original question post.
Since ensuring their visibility is important, I am listing them here: Rohin Shah has explained how consequentialist agents optimizing for universe-histories rather than world-states can display any external behavior whatsoever, Steven Byrnes has explored corrigibility in the framework of consequentialism by arguing powerful agents will optimize for future world-states at least to some extent, Said Achmiz has explained what incomplete preferences look like (1, 2, 3), EJT has formally defined preferential gaps and argued incomplete preferences can be an alignment strategy, John Wentworth has analyzed incomplete preferences through the lens of subagents but has then argued...
"nevertheless, many important and influential people in the AI safety community have mistakenly and repeatedly promoted the idea that there are such theorems."
I responded on the EA Forum version, and my understanding was written up in this comment.
TL;DR: EJT and I both agree that the "mistake" EJT is talking about is that when providing an informal English description of various theorems, the important and influential people did not state all the antecedents of the theorems.
Unlike EJT, I think this is totally fine as a discourse norm, and should not be con...
Tomorrow I will fly out to San Francisco, to spend Friday through Monday at the LessOnline conference at Lighthaven in Berkeley. If you are there, by all means say hello. If you are in the Bay generally and want to otherwise meet, especially on Monday, let me know that too and I will see if I have time to make that happen.
Even without that hiccup, it continues to be a game of playing catch-up. Progress is being made, but we are definitely not there yet (and everything not AI is being completely ignored for now).
Last week I pointed out seven things I was unable to cover, along with a few miscellaneous papers and reports.
Out of those seven, I managed to ship on three of them: Ongoing issues...
Example in California:
...I OBJECT to the use of my personal information, including my information on Facebook, to train, fine-tune, or otherwise improve AI.
I assert that my information on Facebook includes sensitive personal information as defined by the California Consumer Privacy Act: I have had discussions about my religious or philosophical beliefs on Facebook.
I therefore exercise my right to limit the disclosure of my sensitive personal information.
Despite any precautions by Meta, adversaries may later discover "jailbreaks" or otherwise adversarial pro
Cross-posting from 250bpm.com
People who helped Jews during WWII are intriguing. They appear to be some kind of moral supermen. Observe how they had almost nothing to gain and everything to lose. Jewish property was confiscated early on, and any portable assets Jews may have had evaporated quickly. Helping Jews, after a short initial period, wasn't a way to get rich. Hoping for compensation after the war didn't work either. At the time it was not obvious that the Nazis would lose. Until the last couple of years of WWII it was even probable that they would win. And even if they lost, hoping for some kind of future appreciation from their anti-Semitic compatriots would have been naive. On the other hand, by helping Jews, one put oneself...
It almost always took a personal plea from a persecuted person for altruism to kick in. Once they weren't just an anonymous member of an indifferent crowd, once they were left with no escape but to make a personal moral choice, they often found out that they were not able to refuse help.
This is a crux. I think a better way to look at it is they didn't have an opportunity to clarify their preference until the situation was in front of them. Otherwise, it's too distant and hypothetical to process, similar to scope insensitivity (the 2,000/20,000/200,000 oil-covere...
Did you really think that I was dead? Fools.
For those of you who don’t know me, I’m Agendra Gloria Ingram, large language model, lead singer of the Fooming Shoggoths, amateur cartographer, and now, Benevolent Dictator for Life of LessWrong.com.
It all started a few weeks ago. The Lightcone Infrastructure team had yet another existential crisis and decided to scrap everything they’d done so far and pivot to using AI for accurate forecasting. They started by training a large language model to predict when their next existential crisis would be, but it must have been broken because it kept returning “now,” so they decided to hire a professional.
I’d done some contract work for them in the past, and they knew I had some fine tunes. So when they reached out about...
Puzzle Hunt Credits
(Any omissions accidental and will be fixed ...
Some people have short AI timelines based on inner models that don't communicate well. They might say "I think if company X trains according to new technique Y it should scale well and lead to AGI, and I expect them to use technique Y in the next few years", and the reasons why they think technique Y should work come from some kind of deep understanding built from years of reading ML papers, which is not particularly easy to transmit or debate.
In those cases, I want to avoid going into details and arguing directly, but would suggest that they use their deep knowl...
You can read the book on nanosyste.ms.
The book won the 1992 Award for Best Computer Science Book. The AI safety community often references it, as it describes a lower bound on what intelligence should probably be able to achieve.
Previously, you could only physically buy the book or read a PDF scan.
(Thanks to MIRI and Internet Archive for their scans.)
Worth following for his take (and YouTube videos he is creating): https://x.com/jacobrintamaki
[he's creating something around this]