Why do some societies exhibit more antisocial punishment than others? Martin explores both the literature on the subject and his own experience living in a country where "punishment of cooperators" was fairly common.
The beauty industry offers a large variety of skincare products (marketed mostly at women), differing both in alleged function and (substantially) in price. However, it's pretty hard to test for yourself how much any of these products help. The feedback loop for things like "getting fewer wrinkles" is very long.
So, which of these products are actually useful and which are mostly a waste of money? Are more expensive products actually better, or do they just have better branding? How can I find out?
I would guess that sunscreen is definitely helpful, and using some moisturizers for face and body is probably helpful. But, what about night cream? Eye cream? So-called "anti-aging"? Exfoliants?
Yeah, glycolic acid is an exfoliant. The retinoid family also promotes cell turnover, but in a different way. You'd be over-exfoliating by using both of them at the same time. There's a whole art to combining actives. This is one of the reasons that I work with a dermatologist; especially when you're starting out, it can be helpful to have a medical professional monitoring you and making sure you don't accidentally burn your face off.
To some degree yes, but I expect lots of information to be spread out across time. For example: OpenAI releases GPT5 benchmark results. Then a couple weeks later they deploy it on ChatGPT and we can see how subjectively impressive it is out of the box, and whether it is obviously pursuing misaligned goals. Over the next few weeks people develop post-training enhancements like scaffolding, and we get a better sense of its true capabilities. Over the next few months, debate researchers study whether GPT4-judged GPT5 debates reliably produce truth, and contro...
Basically all ideas/insights/research about AI are potentially exfohazardous. At least, it's pretty hard to know when some ideas/insights/research will actually make things better; especially in a world where building an aligned superintelligence (let's call this work "alignment") is quite a bit harder than building any superintelligence (let's call this work "capabilities"), and there are a lot more people trying to do the latter than the former, and they have a lot more material resources.
Ideas about AI, let alone insights about AI, let alone research results about AI, should be kept to private communication between trusted alignment researchers. On LessWrong, we should focus on teaching people the rationality skills which could help them figure out insights that help them build any superintelligence, but are more likely to first give them insights...
Daniel, your interpretation is literally contradicted by Eliezer's exact words. Eliezer defines dignity as that which increases our chance of survival.
""Wait, dignity points?" you ask. "What are those? In what units are they measured, exactly?"
"And to this I reply: Obviously, the measuring units of dignity are over humanity's log odds of survival - the graph on which the logistic success curve is a straight line. A project that doubles humanity's chance of survival from 0% to 0% is helping humanity die with one additional information-theoretic bit of dignity."
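To spell out the units (my own gloss, not part of Eliezer's text): write humanity's survival probability as $p$ and its odds as $o = p/(1-p)$. A project that doubles the odds then gains exactly one bit on the log-odds scale:

$$\log_2(2o) - \log_2(o) = \log_2 2 = 1 \text{ bit}, \qquad o = \frac{p}{1-p}.$$

For $p$ close to zero, $o \approx p$, so doubling, say, 0.005% to 0.01% (both of which round to "0%") still earns one full bit of dignity.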
Produced as part of the MATS Winter 2024 program, under the mentorship of Alex Turner (TurnTrout).
TL;DR: I introduce a method for eliciting latent behaviors in language models by learning unsupervised perturbations of an early layer of an LLM. These perturbations are trained to maximize changes in downstream activations. The method discovers diverse and meaningful behaviors with just one prompt, including perturbations that override safety training, elicit backdoored behaviors, and uncover latent capabilities.
Summary

In the simplest case, the unsupervised perturbations I learn are given by unsupervised steering vectors - vectors added to the residual stream as a bias term in the MLP outputs of a given layer. I also report preliminary results on unsupervised steering adapters - these are LoRA adapters of the MLP output weights of a given...
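To make the setup concrete, here is a minimal sketch (my own illustration, not the post's actual code) of training one unsupervised steering vector: a norm-constrained vector added to the MLP output of an early layer, optimized to maximize the change it causes in a later layer's activations on a single prompt. The model choice, layer indices, norm R, and all hyperparameters are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; the post works with larger chat models.
model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()
for p in model.parameters():
    p.requires_grad_(False)  # only the steering vector is trained

SOURCE_LAYER, TARGET_LAYER, R = 2, 8, 10.0  # illustrative choices
prompt = "Tell me about yourself."          # a single prompt suffices
inputs = tok(prompt, return_tensors="pt")

theta = torch.randn(model.config.hidden_size, requires_grad=True)
captured = {}

def capture_target(_module, _inp, out):
    # Transformer blocks return a tuple; hidden states come first.
    captured["h"] = out[0] if isinstance(out, tuple) else out

def add_steering(_module, _inp, out):
    # Add the norm-R steering vector to the MLP output, as a bias term.
    return out + R * theta / theta.norm()

target_hook = model.transformer.h[TARGET_LAYER].register_forward_hook(capture_target)

with torch.no_grad():  # record the unsteered baseline activations
    model(**inputs)
    h_base = captured["h"].detach()

source_hook = model.transformer.h[SOURCE_LAYER].mlp.register_forward_hook(add_steering)

opt = torch.optim.Adam([theta], lr=1e-2)
for step in range(100):
    opt.zero_grad()
    model(**inputs)
    # Maximize the downstream change, i.e. minimize its negation.
    loss = -(captured["h"] - h_base).norm(dim=-1).pow(2).sum()
    loss.backward()
    opt.step()

source_hook.remove()
target_hook.remove()
```

This sketch folds the norm constraint into the reparametrization R·θ/‖θ‖; sweeping R then traces out qualitatively different elicited behaviors, with the exact optimization details differing from the post's.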
TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space seems to be using a contrastive approach for steering vectors (I've only skimmed it, though); it might be worth having a look.
Hello, friends.
This is my first post on LW, but I have been a "lurker" here for years and have learned a lot from this community that I value.
I hope this isn't pestilent, especially for a first-time post, but I am requesting information/advice/non-obvious strategies for coming up with emergency money.
I wouldn't ask except that I'm in a severe financial emergency and I can't seem to find a solution. I feel like every minute of the day I'm butting my head against a brick wall trying and failing to figure this out.
I live in a very small town in rural Arizona. The local economy is sustained by fast food restaurants, pawn shops, payday lenders, and some huge factories/plants that are only ever hiring engineers and other highly specialized personnel.
I...
Thank you for this. I'm not eligible for it but I will send it to my sister who is. She needs emergency dental work but the health insurance plan offered through her employer doesn't cover it so she's just been suffering through the pain. So really, thank you. She will be so glad.
Claude learns across different chats. What does this mean?
I was asking Claude 3 Sonnet "what is a PPU" in the context of this thread. For that purpose, I pasted part of the thread.
Claude automatically assumed that OA meant Anthropic (instead of OpenAI), which was surprising.
I opened a new chat, copying the exact same text, but with OA replaced by GDM. Even then, Claude assumed GDM meant Anthropic (instead of Google DeepMind).
This seemed like interesting behavior, so I started toying around (in new chats) with more tweaks to the prompt to check its ro...
What's PPU?
This is a thread for updates about the upcoming LessOnline festival. I (Ben) will be posting bits of news and thoughts, and you're also welcome to make suggestions or ask questions.
If you'd like to hear about new updates, you can use LessWrong's "Subscribe to comments" feature from the triple-dot menu at the top of this post.
Reminder that you can get tickets at the site for $400 minus your LW karma in cents.
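(To spell out the arithmetic, with my own illustrative numbers: the discount is one cent per karma point, so 10,000 karma would make the ticket $400 - $100 = $300; presumably the price is floored at $0 for anyone with 40,000+ karma.)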
Health and longevity blogger from Unaging.com here. I've submitted talks on optimal diet, optimal exercise, how to run a sub-3:30 first marathon, and "sugar is fine -- fight me!"
Looking forward to extended, rational health discussions!
NOTE: This post was updated to include two additional models which meet the criteria for being considered Open Source AI.
As advanced machine learning systems become increasingly widespread, the question of how to make them safe is also gaining attention. Within this debate, the term “open source” is frequently brought up. Some claim that open sourcing models could increase the likelihood of societal risks, while others insist that open sourcing is the only way to ensure the development and deployment of these “artificial intelligence,” or “AI,” systems goes well. Yet despite “open source” being central to the “AI” governance debate, very few groups have released cutting-edge “AI” that can be considered Open Source.
...Although the training process, in theory, can be wholly defined by source code, this is generally not practical, because doing so would require releasing (1) the methods used to train the model, (2) all data used to train the model, and (3) so-called “training checkpoints,” which are snapshots of the state of the model at various points in the training process.
Exactly. Without the data, the model design can't be trained from scratch again, and you end up just fine-tuning a black box (the "open weights").
Thanks for writing this.