Why do some societies exhibit more antisocial punishment than others? Martin explores both the literature on the subject and his own experience living in a country where "punishment of cooperators" was fairly common.

William_S (21h)
I worked at OpenAI for three years, from 2021 to 2024, on the Alignment team, which eventually became the Superalignment team. I worked on scalable oversight, as part of the team developing critiques as a technique for using language models to spot mistakes in other language models. I then worked to refine an idea from Nick Cammarata into a method for using language models to generate explanations for features in language models. I was then promoted to managing a team of 4 people, which worked on trying to understand language model features in context, leading to the release of an open-source "transformer debugger" tool. I resigned from OpenAI on February 15, 2024.
habryka (16h)
Does anyone have any takes on the two Boeing whistleblowers who died under somewhat suspicious circumstances? I haven't followed this in detail, and my guess is it's basically just random chance, but it sure would be a huge deal if a publicly traded company were now performing assassinations of U.S. citizens. Curious whether anyone has looked into this, or has thought much about the baseline risk of assassinations or other forms of violence from economic actors.
Thomas Kwa (6h)
You should update by ±1% on AI doom surprisingly frequently.

This is just a fact about how stochastic processes work. If your p(doom) is Brownian motion in 1% steps, starting at 50% and stopping once it reaches 0% or 100%, then there will be about 50^2 = 2500 steps of size 1% (the expected number of steps before a simple random walk started 50 steps from each boundary gets absorbed is 50 × 50). This is a lot! If we get all the evidence for whether humanity survives or not uniformly over the next 10 years, then you should make a 1% update 4-5 times per week. In practice there won't be as many, due to heavy-tailedness in the distribution concentrating the updates in fewer events, and the fact that you don't start at 50%. But I do believe that evidence is coming in every week such that ideal market prices should move by 1% in maybe half of weeks, and it is not crazy for your probabilities to shift by 1% during many weeks if you think about it.
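A quick simulation (my illustration, not part of the quick take) confirms the 50^2 figure; spread over the ~520 weeks in 10 years, 2500 steps is indeed the stated 4-5 per week:

```python
# Simulate a +/-1% random walk on p(doom), starting at 50% and absorbed
# at 0% or 100%, and count how many 1%-sized updates occur on average.
import random

def steps_to_absorption(start=50, lo=0, hi=100):
    p, steps = start, 0
    while lo < p < hi:
        p += random.choice((-1, 1))  # one +/-1% update
        steps += 1
    return steps

random.seed(0)
trials = [steps_to_absorption() for _ in range(2000)]
print(sum(trials) / len(trials))  # ~2500, matching 50 * 50
```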
Dalcy (19h)
Thoughtdump on why I'm interested in computational mechanics:

* one concrete application to natural abstractions from here: tl;dr, belief structures generally seem to be fractal shaped. one major part of natural abstractions is trying to find the correspondence between structures in the environment and concepts used by the mind. so if we can do the inverse of what adam and paul did, i.e. 'discover' fractal structures from activations and figure out what stochastic process they might correspond to in the environment, that would be cool
* ... but i was initially interested in reading compmech stuff not with a particular alignment-relevant thread in mind, but rather because it seemed broadly similar in direction to natural abstractions.
* re: how my focus would differ from my impression of current compmech work done in academia: academia seems faaaaaar less focused on actually trying out epsilon reconstruction on real-world noisy data. CSSR is an example of a reconstruction algorithm. apparently people have done compmech stuff on real-world data; i don't know how good it is, but effort-wise it's far under-invested compared to theory work.
* i'd be interested in these reconstruction algorithms, e.g. what are the bottlenecks to scaling them up, etc.
* tangent: epsilon transducers seem cool. if the reconstruction algorithm is good, a prototypical example i'm thinking of is something like: pick some input-output region within a model, and literally try to discover the hmm reconstructing it? of course it's gonna be unwieldy large. but, to shift the thread in the direction of bright-eyed theorizing ...
* the foundational Calculi of Emergence paper talked about the possibility of hierarchical epsilon machines, where you do epsilon machines on top of epsilon machines. for simple examples where you can do this analytically, you get wild things like more and more compact representations of stochastic processes (e.g. data stream -> tree -> markov model -> stack automaton -> ... ?)
* this ... sounds like natural abstractions in its wildest dreams? literally point at some raw datastream and automatically build hierarchical abstractions that get more compact as you go up
* haha but alas, (almost) no development afaik since the original paper. seems cool
* and also, more tangentially, compmech seems to have a lot to say about providing interesting semantics to various information measures, aka True Names, so another angle i was interested in was learning about those.
* e.g. crutchfield talks a lot about developing a right notion of information flow - obvious usefulness in e.g. formalizing boundaries?
* many other information measures from compmech have suggestive semantics—cryptic order? gauge information? synchronization order? check ruro1 and ruro2 for more.
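To make "epsilon reconstruction" concrete, here is an illustrative sketch of the CSSR idea (my own toy code, not any published implementation; the "even process" example, suffix length, and merge tolerance are all my choices). It estimates next-symbol distributions for history suffixes and greedily merges suffixes whose distributions match, approximating causal states; real CSSR uses hypothesis tests and iterative state-splitting rather than this one-shot greedy merge.

```python
# Toy CSSR-flavored reconstruction: cluster history suffixes of a symbol
# sequence by their estimated next-symbol distributions.
from collections import Counter, defaultdict
import random

def next_symbol_dists(seq, max_len):
    """Estimate P(next symbol | suffix) for all suffixes up to max_len."""
    counts = defaultdict(Counter)
    for length in range(1, max_len + 1):
        for i in range(len(seq) - length):
            counts[seq[i:i + length]][seq[i + length]] += 1
    return {suf: {s: n / sum(c.values()) for s, n in c.items()}
            for suf, c in counts.items()}

def cluster_suffixes(dists, alphabet, tol=0.1):
    """Greedily merge suffixes whose distributions differ by < tol in L1."""
    states = []  # each entry: (representative distribution, member suffixes)
    for suffix, d in sorted(dists.items()):
        for rep, members in states:
            if sum(abs(d.get(s, 0.0) - rep.get(s, 0.0)) for s in alphabet) < tol:
                members.append(suffix)
                break
        else:
            states.append((d, [suffix]))
    return states

# Sample from the "even process" (all runs of 1s have even length),
# a standard two-causal-state example from the compmech literature.
random.seed(0)
out, state = [], 0
for _ in range(20000):
    if state == 0 and random.random() < 0.5:
        out.append("0")
    else:
        out.append("1")
        state = 1 - state  # emit 1s in pairs
seq = "".join(out)

for dist, members in cluster_suffixes(next_symbol_dists(seq, 3), "01"):
    print({s: round(p, 2) for s, p in dist.items()}, members[:4])
```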
Buck (2d)
[epistemic status: I think I’m mostly right about the main thrust here, but probably some of the specific arguments below are wrong. In the following, I'm much more stating conclusions than providing full arguments. This claim isn’t particularly original to me.]

I’m interested in the following subset of risk from AI:

* Early: risk that comes from AIs that are just powerful enough to be extremely useful and dangerous-by-default (i.e. these AIs aren’t wildly superhuman).
* Scheming: risk associated with loss of control to AIs that arises from AIs scheming.
  * So e.g. I exclude state actors stealing weights in ways that aren’t enabled by the AIs scheming, and I also exclude non-scheming failure modes. IMO, state actors stealing weights is a serious threat, but non-scheming failure modes aren’t (at this level of capability and dignity).
* Medium dignity: that is, developers of these AIs are putting a reasonable amount of effort into preventing catastrophic outcomes from their AIs (perhaps they’re spending the equivalent of 10% of their budget on cost-effective measures to prevent catastrophes).
* Nearcasted: no substantial fundamental progress on AI safety techniques, no substantial changes in how AI works.

This subset of risk is interesting because I think it’s a natural scenario at which to target technical work on AI safety. (E.g. it’s the main scenario we’re targeting with our AI control agenda.)

I claim that the majority of this risk comes from the lab using AIs internally to do AI development (by which I mean both research and engineering). This is because the AIs doing AI development naturally require access to compute and model weights that they can potentially leverage into causing catastrophic outcomes—in particular, those resources can be abused to run AIs unmonitored.

Using AIs for AI development looks uniquely risky to me among applications of early-transformative AIs, because unlike all other applications I know about:

* It’s very expensive to refrain from using AIs for this application.
* There’s no simple way to remove affordances from the AI such that it’s very hard for the AI to take a small sequence of actions which plausibly lead quickly to loss of control. In contrast, most other applications of AI probably can be controlled just by restricting their affordances.

If I’m right that the risk from scheming early-transformative models is concentrated onto this pretty specific scenario, it implies a bunch of things:

* It implies that work on mitigating these risks should focus on this very specific setting.
* It implies that AI control is organizationally simpler, because most applications can be made trivially controlled.
* It is pretty plausible to me that AI control is quite easy, because you actually can remove affordances from the AIs that are doing AI R&D such that it’s hard for them to cause problems.


Recent Discussion

The beauty industry offers a large variety of skincare products (marketed mostly at women), differing both in alleged function and (substantially) in price. However, it's pretty hard to test for yourself how much any of these products helps. The feedback loop for things like "getting fewer wrinkles" is very long.

So, which of these products are actually useful, and which are mostly a waste of money? Are more expensive products actually better, or do they just have better branding? How can I find out?

I would guess that sunscreen is definitely helpful, and using some moisturizers for face and body is probably helpful. But what about night cream? Eye cream? So-called "anti-aging"? Exfoliants?

Answer by ErisApprentice, May 04, 2024

An important thing to keep in mind is that cosmetics companies don't necessarily have the money that e.g. pharmaceutical companies do to push large-scale studies on their products, so a lack of evidence usually means a study wasn't done, rather than that a study was done and found inconclusive.

If you haven't heard of it before, the subreddit 'SkincareAddiction' has some great recommendations for what's evidence-based and what works. 

ophira (4h)
Yeah, glycolic acid is an exfoliant. The retinoid family also promotes cell turnover, but in a different way. You'd be over-exfoliating by using both of them at the same time.
Razied (14h)
A weird side effect to beware of with retinoids: they make dry eyes worse, and in my experience this can significantly decrease your quality of life, especially if it prevents you from sleeping well.
nebuchadnezzar (15h)
Regarding sunscreens, Hyal Reyouth Moist Sun by the Korean brand Dr. Ceuracle is the most cosmetically elegant sun essence I have ever tried. It boasts SPF 50+, PA++++, chemical filters (no white cast) and is very pleasant to the touch and smell, not at all a sensory nightmare.

TLDR: Writing pseudocode is extremely useful when designing algorithms. Most people do it wrong, mainly because they don't intentionally try to keep the code as abstract as possible. Possibly this happens because in normal programming, a common workflow is to focus on getting the implementation of one function right before moving on.


I like it when I tell somebody about an idea and they ask me, "How would you write this in Python?" To explain your idea to the Python interpreter, you need a really good understanding of it, and trying to explain it often forces you to improve that understanding.

However, in my experience, it is much better to start by writing high-level pseudocode, to iteratively build...
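A minimal sketch of that high-level starting point (my illustration, not from the post; the route-finding task and all function names are hypothetical): the top-level algorithm is designed first, and every hard sub-problem stays behind an abstract stub until the design is settled.

```python
# High-level pseudocode as runnable Python: the top-level logic is
# concrete, every detail is an explicitly deferred stub.

def best_route(city_map, start, goal):
    """Top-level algorithm: generate candidate routes, pick the fastest."""
    candidates = enumerate_routes(city_map, start, goal)
    return min(candidates, key=estimated_travel_time)

def enumerate_routes(city_map, start, goal):
    """Stub: which search strategy to use is a detail, decided later."""
    raise NotImplementedError

def estimated_travel_time(route):
    """Stub: the cost model is a detail, decided later."""
    raise NotImplementedError
```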

Johannes C. Mayer (1h)
Yes, I totally agree. But I kind of got distracted with this post and wanted to get back to work as quickly as possible. So instead of making it a perpetual draft, I pushed a bit further and got some MVP thing. I agree it's not that good; adding concrete examples and really making this more like a tutorial would be the first step I would take. So I am wondering whether doing something like this is still useful. Would it have been better or worse if I had made it a perpetual draft?
Algon (15m)

Probably it would have been worse as a perpetual draft.

Shannon Vallor is the first Baillie Gifford Chair in the Ethics of Data and Artificial Intelligence at the Edinburgh Futures Institute. She is the author of Technology and the Virtues: A Philosophical Guide to a Future Worth Wanting. She believes that we need to build and cultivate a new virtue ethics appropriate to our technological era. The following summarizes some of her arguments and observations:

Why Do We Need New Virtues?

Vallor believes that we need to discover and cultivate a new set of virtues that is appropriate to the onslaught of technological change that marks our era. This is for several reasons, including:

  • Technology is extending our reach, so that our decisions have effects with broader ethical implications than traditional moral wisdom is prepared to cope with.
  • Technology is also starting
...
p.b. (20m)

I think all the assumptions that go into this model are quite questionable, but it's still an interesting thought.

Seth Herd (2h)
But... why would p(doom) move like Brownian motion until stopping at 0 or 1? I don't disagree with your conclusions: there's a lot of evidence coming in, and if you're spending full time or even part time thinking about alignment, you'll make a lot of important updates on it. But assuming a random walk seems wrong. Is there a reason that a complex, structured unfolding of reality would look like a random walk?
Thomas Kwa (4h)
To some degree yes, but I expect lots of information to be spread out across time. For example: OpenAI releases GPT5 benchmark results. Then a couple weeks later they deploy it on ChatGPT and we can see how subjectively impressive it is out of the box, and whether it is obviously pursuing misaligned goals. Over the next few weeks people develop post-training enhancements like scaffolding, and we get a better sense of its true capabilities. Over the next few months, debate researchers study whether GPT4-judged GPT5 debates reliably produce truth, and control researchers study whether GPT4 can detect whether GPT5 is scheming. A year later an open-weights model of similar capability is released and the interp researchers check how understandable it is and whether activation steering still works.
niplav (5h)
Thank you a lot! Strong upvoted. I was wondering a while ago whether Bayesianism says anything about how much my probabilities are "allowed" to oscillate—I was noticing that my probability of doom was often moving by 5% in the span of 1-3 weeks, though I guess this was mainly due to logical uncertainty and not empirical uncertainty.

I say this because I can hardly use a computer without constantly getting distracted. Even when I actively try to ignore how bad software is, the suggestions keep coming.

Seriously, Obsidian? You could not come up with a system where links to headings can't break? This makes you wonder what is wrong with humanity. But then I remember that humanity is building a god without knowing what it will want.

So for those of you who need to hear this: I feel you. It could be so much better. But right now, can we really afford to make the ultimate <programming language/text editor/window manager/file system/virtual collaborative environment/interface to GPT/...>?

Can we really afford to do this while our god software looks like...

May this find you well.

Ustice (2h)
I don’t know about making god software, but human software is a lot of trial and error. I have been writing code for close to 40 years. The best I can do is write automated tests to anticipate the kinds of errors I might get. My imagination just isn’t as strong as reality. There is provably no way to fully predict how a software system of sufficient complexity will behave. With careful organization it becomes easier to reason about and predict, but unless you are writing provable software (a very slow and complex process, I hear), that’s the best you get. I feel you on being distracted by software bugs. I’m one of those guys who reports them, or even submits code change suggestions (GitHub pull requests).


I think it is incorrect to say that testing things fully formally is the only alternative to whatever the heck we are currently doing. I mean there is property-based testing as a first step (which maybe you also refer to with automated tests but I would guess you are probably mainly talki... (read more)
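A minimal example of the property-based testing mentioned above (my illustration using the `hypothesis` library; the round-trip functions are invented for the demo): instead of hand-picking test cases, you state a property and let the framework search thousands of generated inputs for a counterexample.

```python
# Property-based test: a run-length encoder should round-trip losslessly.
from hypothesis import given, strategies as st

def run_length_encode(s: str) -> list[tuple[str, int]]:
    encoded: list[tuple[str, int]] = []
    for ch in s:
        if encoded and encoded[-1][0] == ch:
            encoded[-1] = (ch, encoded[-1][1] + 1)
        else:
            encoded.append((ch, 1))
    return encoded

def run_length_decode(pairs: list[tuple[str, int]]) -> str:
    return "".join(ch * n for ch, n in pairs)

@given(st.text())  # hypothesis generates arbitrary strings, incl. edge cases
def test_roundtrip(s: str) -> None:
    assert run_length_decode(run_length_encode(s)) == s
```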

Johannes C. Mayer (2h)
"If you are assuming Software works well you are dead" because: * If you assume this you will be shocked by how terrible software is every moment you use a computer, and your brain will constantly try to fix it wasting your time. * You should not assume that humanity has it in it to make the god software without your intervention. * When making god software: Assume the worst.

The Edge home page featured an online editorial that downplayed AI art because it just combines images that already exist. But if you look closely enough, human artwork is also a combination of things that already existed.

One example is Blackballed Totem Drawing: Roger 'The Rajah' Brown, a charcoal drawing James Pate made in 2016; it was the Individual Artist Winner of the Governor's Award for the Arts. At the microscopic scale, this artwork consists of black particles embedded in a large sheet of paper. I doubt he made the paper he drew on, and the black... (read more)


Happy May the 4th from Convergence Analysis! Cross-posted on the EA Forum.

As part of Convergence Analysis’s scenario research, we’ve been looking into how AI organisations, experts, and forecasters make predictions about the future of AI. In February 2023, the AI research institute Epoch published a report in which its authors use neural scaling laws to make quantitative predictions about when AI will reach human-level performance and become transformative. The report has a corresponding blog post, an interactive model, and a Python notebook.

We found this approach really interesting, but also hard to understand intuitively. While trying to follow how the authors derive a forecast from their assumptions, we wrote a breakdown that may be useful to others thinking about AI timelines and forecasting. 

In what follows, we set out our interpretation of...
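For readers who want the flavor of the underlying machinery (a generic illustration, not Epoch's actual model): scaling-law forecasts start from a parametric loss curve such as the Chinchilla fit L(N, D) = E + A/N^α + B/D^β (Hoffmann et al. 2022), then layer assumptions about compute growth and capability thresholds on top.

```python
# Chinchilla-style parametric scaling law, using the fitted constants
# reported by Hoffmann et al. (2022). Illustrative only; Epoch's report
# builds a more elaborate forecasting model on top of curves like this.
E, A, B, ALPHA, BETA = 1.69, 406.4, 410.7, 0.34, 0.28

def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Predicted pretraining loss for N parameters trained on D tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

# Example: a 70B-parameter model on 1.4T tokens (Chinchilla's scale).
print(round(predicted_loss(70e9, 1.4e12), 2))  # ~1.94
```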

Martín Soto (6h)
What's PPU?

From here:

Profit Participation Units (PPUs) represent a unique compensation method, distinct from traditional equity-based rewards. Unlike shares, stock options, or profit interests, PPUs don't confer ownership of the company; instead, they offer a contractual right to participate in the company's future profits.

mishka (8h)
No, OpenAI (assuming that it is a well-defined entity) also uses a probability distribution over timelines. (In reality, every member of its leadership has their own probability distribution, and this translates to OpenAI having a policy and behavior formulated approximately as if there is some resulting single probability distribution.) The important thing is, they are uncertain about timelines themselves: in part because no one knows how perplexity translates to capabilities; in part because there might be differences in capabilities even at the same perplexity, if the underlying architectures are different (e.g. in-context learning might depend on architecture even at fixed perplexity, and we do see a stream of potentially very interesting architectural innovations recently); in part because it's not clear how big the potential of "harness"/"scaffolding" is; and so on. This does not mean there is no political infighting. But it's on the background of them being correctly uncertain about true timelines...

Compute-wise, inference demands are huge and growing with the popularity of the models (look how much Facebook did to make Llama 3 more inference-efficient). So if they expect models to become useful enough for almost everyone to want to use them, they should worry about compute, assuming they do want to serve people like they say they do (I am not sure how this looks for very strong AI systems; they will probably be gradually expanding access, and the speed of expansion might depend).
LawrenceC (9h)
When I spoke to him a few weeks ago (a week after he left OAI), he had not signed an NDA at that point, so it seems likely that he hasn't.

Hello, friends.

This is my first post on LW, but I have been a "lurker" here for years and have learned a lot from this community that I value.

I hope this isn't pestilent, especially for a first-time post, but I am requesting information/advice/non-obvious strategies for coming up with emergency money.

I wouldn't ask except that I'm in a severe financial emergency and I can't seem to find a solution. I feel like every minute of the day I'm butting my head against a brick wall trying and failing to figure this out.

I live in a very small town in rural Arizona. The local economy is sustained by fast food restaurants, pawn shops, payday lenders, and some huge factories/plants that are only ever hiring engineers and other highly specialized personnel.

I...

Tigerlily (5h)
Thank you for this. I'm not eligible for it but I will send it to my sister who is. She needs emergency dental work but the health insurance plan offered through her employer doesn't cover it so she's just been suffering through the pain. So really, thank you. She will be so glad.
Tigerlily (5h)
Thank you for the thoughtful suggestions. Aella is exemplary but camgirling strikes me as a nightmare. I have considered making stuff, like custom glasses/premium drinkware, and selling on Etsy but the market seems saturated and I've never had the money to buy the equipment to learn the skills required to do this kind of thing. I am certified in Salesforce and could probably get hired helping to manage the Salesforce org for my tribe (Cherokee Nation) but would have to move to Oklahoma. I've applied for every grant I can find that I'm eligible for, but there's not much out there and the competition is stiff. We will figure out something, I'm sure. If we don't, there's nothing standing between us and homelessness and that reality fills me with anger and despair. I feel like there's nothing society wants from me, so there's no way for me to convince society that I deserve anything from it. It's so hard out here.
Tigerlily (12h)
Thank you for your response. I probably should have given a more exhaustive list of things I have already tried. Other than a couple things you mentioned, I have already tried the rest.

Before becoming a stay-at-home parent, I was a writer. I wasn't well paid but was starting to earn professional rates when I got pregnant with my second child, and that took over my life. I have found it difficult to start writing again since then. The industry has changed so much and is changing still, and so am I. My life is so different now. I'm less sure of what I write - no longer young enough to know everything, as Oscar Wilde said. I feel like I'm trying to leap onto a speeding train from the ground, like I'm watching for an open doorway or a platform I can grab onto as the train roars past me at 100mph.

My children -- yes, I have children. They are with their dad most of the time. It was his mother's house we were living in when the domestic violence situation got so severe that the courts got involved and separated us, and when that happened it was I who had to leave. His mother was not about to turn out her son and let me stay in her house, especially since he was the breadwinner and the one paying rent to her. And I was not going to drag my children into a precarious housing situation. There are no emergency housing resources where I live aside from shelters, which are known for being miserable, overcrowded, prison-like, and difficult to get into anyway. So my children have stayed in the safety of their dad's home. His mother came from across the state to help, and while I'm relieved to see that she is taking the responsibility of caring for them seriously, she is also tenaciously possessive over them. This is still very painful for me to talk about.
nim (1h)

Ah, so you have skill and a portfolio in writing. You have the cognitive infrastructure to support using the language as art. That infrastructure itself is what you should be trying to rent to tech companies -- not the art it's capable of producing.

If the art part of writing is out of reach for you right now, that's ok -- it's almost a benefit in this case, because if it's not around, it can't feel left out when you turn the skills you used to celebrate it with to more pragmatic ends.

Normally I wouldn't suggest startups, because they're so risky/uncertain... (read more)
