This is a linkpost for https://nathanpmyoung.substack.com/p/questions-are-too-cheap

It is easier to ask than to answer.

That’s my whole point.

It is much cheaper to ask questions than answer them so beware of situations where it is implied that asking and answering are equal.

Here are some examples:

Let's say there is a maths game. I get a minute to ask questions. You get a minute to answer them. If you answer them all correctly, you win, if not, I do. Who will win?

Preregister your answer.

Okay, let's try. These questions took me roughly a minute to come up with.

What's 56,789 * 45,387?

What's the integral from -6 to 5π of sin(x cos^2(x))/tan(x^9) dx?

What's the prime factorisation of 91435293173907507525437560876902107167279548147799415693153?

Good luck. If I understand correctly, that last one's gonna take you at least an hour¹ (or however long it takes to threaten...

(Continue Reading – 1669 more words)

Dagon2m20

I agree with your assertion that pure factual questions are cheaper and easier than (correct) answers. I fully disagree with the premise that they're currently "too cheap".

I see many situations where questions and answers are treated as symmetric.

I see almost none. I see MANY situations where both are cheap, but even then answers are more useful and valued. I see others where finding the right questions is valued, but answering is even more so. And plenty where the answer isn't available, but the thinking about how to get closer to ... (read more)

Ethics and prospects of AI related jobs?

dr_s

I've been on the lookout for new jobs recently and one thing I have noticed is that the market seems flooded with ads for AI-related jobs. What I mean is not work on building models (or aligning them, alas), but rather, work on building applications using generative AI or other advances to make new software products. My impression of this is that first, there's probably something of a bubble, because I doubt many of these ideas can deliver on their promises, especially as they rely so heavily on still pretty unreliable LLMs and such. And second, that while the jobs are well paid and sound fun, I'm not sure how I feel about them. These jobs all essentially aim at automating away other jobs, one way or another....

(See More – 47 more words)

Answer by DagonMay 11, 202420

It's definitely overhyped. I hesitate to call it a bubble - it's more like the normal software business model with a new cover. Tons of projects and startups with pretty tenuous business models and improbable grand visions, most of which will peter out after a few years. But that has been going on for decades, and will likely continue until true AI makes it all irrelevant.

Most of these jobs are less interesting, and less impactful than they claim. Which makes the ethical considerations far less important. My advice is to focus on th... (read more)

2dr_s3h

I suppose I'm mostly also looking for aspects of this I might have overlooked, or inside perspective about any details from someone who has relevant experience. I think I tend to err a bit on caution on things but ultimately I believe that "staying pure" is rarely a road to doing good (at most it's a road to not doing bad, but that's relatively easy if you just do nothing at all). Some of the problems with automation would have applied to many of the previous rounds of it, and those ultimately came out mostly good, I think, but also it somehow feels This Time It's Different (but then again, I do tend to skew towards pessimism and seeing all the possible ways things can go wrong...).

3Jay Bailey3h

I guess my way of thinking of it is - you can automate tasks, jobs, or people. Automating tasks seems probably good. You're able to remove busywork from people, but their job is comprised of many more things than that task, so people aren't at risk of losing their jobs. (Unless you only need 10 units of productivity, and each person is now producing 1.25 units so you end up with 8 people instead of 10 - but a lot of teams could also quite use 12.5 units of productivity well) Automating jobs is...contentious. It's basically the tradeoff I talked about above. Automating people is bad right now. Not only are you eliminating someone's job, you're eliminating most other things this person could do at all. This person has had society pass them by, and I think we should either not do that or make sure this person still has sufficient resources and social value to thrive in society despite being automated out of an economic position. (If I was confident society would do this, I might change my tune about automating people) So, I would ask myself - what type of automation am I doing? Am I removing busywork, replacing jobs entirely, or replacing entire skillsets? (Note: You are probably not doing the last one. Very few, if any, are. The tech does not seem there atm. But maybe the company is setting themselves up to do so as soon as it is, or something) And when you figure out what type you're doing, you can ask how you feel about that.

2dr_s2h

A fair point. I suppose part of my doubt though is exactly: are most of these applications going to automate jobs, or merely tasks? And to what extent does contributing to either advance the know how that might eventually help automating people?

Selfmaker662's Shortform

Selfmaker662

Shoshannah Tekofsky29m30

I discovered the Netherlands actually has a good dating app that doesn't exist outside of it... I'm rather baffled. I have no idea how they started. I've messaged them asking if they will localize and expand and they thanked me for the compliment so... Dunno?

It's called Paiq and has a ton of features I've never seen before, like speed dating, picture hiding by default, quizzes you make for people that they can try to pass to get a match with you, photography contacts that involve taking pictures of stuff around and getting matched on that, and a few other things... It's just this grab bag of every way to match people that is not your picture or a blurb. It's really good!

2dr_s1h

Has anyone ever tried outlining a straight up first come first served system? Vet and pay a first batch of VIP users, then offer incentives to later joiners (eg vouchers for other products), then just free users, and finally introduce fees after reaching a certain user base, all committed to and outlined transparently from the beginning of course.

2Seth Herd2h

You need to have bunches of people use it for it to be any good, no matter how good the algorithm.

1Selfmaker6625h

Right, I completely missed the network effects, 5 minutes of thinking through wasn’t enough. May be there even are good apps there, which didn’t make it through the development and marketing part. Thanks, Vanessa!

MATS Winter 2023-24 Retrospective

Rocket, Ryan Kidd, LauraVaughan, McKennaFitzgerald, Christian Smith, Juan Gil, Henry Sleight

15h

Co-Authors: @Rocket, @Ryan Kidd, @LauraVaughan, @McKennaFitzgerald, @Christian Smith, @Juan Gil, @Henry Sleight

The ML Alignment & Theory Scholars program (MATS) is an education and research mentorship program for researchers entering the field of AI safety. This winter, we held the fifth iteration of the MATS program, in which 63 scholars received mentorship from 20 research mentors. In this post, we motivate and explain the elements of the program, evaluate our impact, and identify areas for improving future programs.

Summary

Key details about the Winter Program:

The four main changes we made after our Summer program were:
- Reducing our scholar stipend from $40/h to $30/h based on alumni feedback;
- Transitioning Scholar Support to Research Management;
- Using the full Lighthaven campus for office space as well as housing;
- Replacing Alignment 201 with AI Strategy Discussions.
Educational attainment of MATS scholars:
- 48% of scholars

...

(Continue Reading – 14651 more words)

OliverHayman2h10

I'm noticing there are still many interp mentors for the current round of MATS -- was the "fewer mech interp mentors" change implemented for this cohort, or will that start in Winter or later?

2Sheikh Abdur Raheem Ali11h

I love this report! Shed a tear at not seeing Microsoft on the organization interest chart though 🥲. We could be a better Bing T_T.

1Ryan Kidd10h

Oh, I think we forgot to ask scholars if they wanted Microsoft at the career fair. Is Microsoft hiring AI safety researchers?

1Sheikh Abdur Raheem Ali6h

Yes, here’s an open position: Research Scientist - Responsible & OpenAI Research. Of course, responsible AI differs from interpretability, activation engineering, or formal methods (e.g., safeguarded AI, singular learning theory, agent foundations). I’ll admit we are doing less of that than I’d prefer, partially because OpenAI shares some of its ‘secret safety sauce’ with us, though not all, and not immediately. Note from our annual report that we are employing 1% fewer people than this time last year, so headcount is a very scarce resource. However, the news reported we invested ~£2.5b in setting up a new AI hub in London under Jordan Hoffman, with 600 new seats allocated to it (officially, I can neither confirm nor deny these numbers). I’m visiting there this June after EAG London. We’re the only member of the Frontier Model Forum without an alignment team. MATS scholars would be excellent hires for such a team, should one be established. Some time ago, a few colleagues helped me draft a white paper to internally gather momentum and suggest to leadership that starting one there might be beneficial. Unfortunately, I am not permitted to discuss the responses or any future plans regarding this matter.

Do you know of lists of p(doom)s/AI forecasts/ AI quotes?

Nathan Young

I am trying to gather a list of answers/quotes from public figures to the following questions:

What are the chances that AI will cause human extinction?
Will AI automate most human labour?
Should advanced AI models be open source?
Do humans have a moral duty to build artificial superintelligence?
Should there be international regulation of advanced AI?
Will AI be used to make weapons of mass destruction (WMDs)?

I am writing them down here if you want to look/help: https://docs.google.com/spreadsheets/d/1HH1cpD48BqNUA1TYB2KYamJwxluwiAEG24wGM2yoLJw/edit?usp=sharing

Nathan Young2h20

Thank you, this is the kind of thing I was hoping to find.

Dating Roundup #3: Third Time’s the Charm

Zvi

The first speculated on why you’re still single. We failed to settle the issue. A lot of you were indeed still single. So the debate continues.

The second gave more potential reasons, starting with the suspicion that you are not even trying, and also many ways you are likely trying wrong.

The definition of insanity is trying the same thing over again expecting different results. Another definition of insanity is dating in 2024. Can’t quit now.

You’re Single Because Dating Apps Keep Getting Worse

A guide to taking the perfect dating app photo. This area of your life is important, so if you intend to take dating apps seriously then you should take photo optimization seriously, and of course you can then also use the photos for other things.

I love the...

(Continue Reading – 11504 more words)

lc2h20

Manifold Love: pro-tip: if a woman measures her hand against yours, this is almost always flirtation.

Totally did not know this. Is this true? [10% react x 2]

A little taken aback by this response. It's not just flirting, it's outright romantic. Asking this is like asking if a woman resting their head on a mans chest and purring is "flirting". I didn't realize this was a common experience for guys not in a relationship with the particular woman.

1rotatingpaguro9h

Ok, then I agreed. I was interpreting the advice in a different way, but your interpretation looks more reasonable.

1Curt Tigges9h

Very kind of you to say. :) I think for me, though, the source of the emotion I felt when reading this series was something like: "Ah, so in addition to ensuring we are dateable ourselves, we must fix society, capitalism (at least the dating part of it), culture, etc. in order to have a Good Dating Universe." Which in retrospect was a bit overblown of me, so I think I no longer endorse the strong version of what I said in that comment.

1quiet_NaN16h

This made me laugh out loud. Otherwise, my idea for a dating system would be that given that the majority of texts written will invariably end up being LLM-generated, it would be better if every participant openly had an AI system as their agent. Then the AI systems of both participants could chat and figure out how their user would rate the other user based on their past ratings of suggestions. If the users end up being rated among each others five most viable candidates, Of course, if the agents are under the full control of the users, the next step of escalation will be that users will tell their agents to lie on their behalf. ('I am into whatever she is into. If she is big on horses, make up a cute story about me having had a pony at some point. Just put the relevant points on the cheat sheet for the date'.) This might be solved by having the LLM start by sending out a fixed text document. If horses are mentioned as item 521, after entomology but before figure skating, the user is probably not very interested in them. Of course, nothing would prevent a user from at least generically optimizing their profile to their target audience. "A/B testing has shown that the people you want to date are mostly into manga, social justice and ponies, so this is what you should put on your profile." Adversarially generated boyfriend?

To get the best posts emailed to you, create an account! (2-3 posts per week, selected by the LessWrong moderation team.)

Maximal Lottery-Lotteries Exist

Lorxus

epistemic/ontological status: almost certainly all of the following -

a careful research-grade writeup of a genuinely kinda shiny open(?) question in theoretical psephology that will likely never see actual serious non-cooked-up use;
dedicated to a very dear cat;
utterly dependent, for the entirety of the most interesting parts, on definitions I have come up with and results I have personally proven partially using them, which I have done with a professional mathematician's care; some friends and strangers have also checked them over;
my attempt to prove that something that can reasonably be called a maximal lottery-lottery exists;
my attempt to scavenge what others have left behind and craft a couple of missing pieces, and then to lay out a blueprint for how it could begin to work;
not a 30-minute read
the first half

...

(Continue Reading – 7695 more words)

Lorxus3h10

To avoid confusion: this post and my reply to it were also on a past version of this post; that version lacked any investigation of dominance criterion desiderata for lottery-lotteries.

Dyslucksia

Shoshannah Tekofsky

The curious tale of how I mistook my dyslexia for stupidity - and talked, sang, and drew my way out of it.

Sometimes I tell people I’m dyslexic and they don’t believe me. I love to read, I can mostly write without error, and I’m fluent in more than one language.

Also, I don’t actually technically know if I’m dyslectic cause I was never diagnosed. Instead I thought I was pretty dumb but if I worked really hard no one would notice. Later I felt inordinately angry about why anyone could possibly care about the exact order of letters when the gist is perfectly clear even if if if I right liike tis.

I mean, clear to me anyway.

I was 25 before it dawned on me that all the tricks...

(Continue Reading – 1770 more words)

1Shoshannah Tekofsky3h

Thanks! :D Attention is a big part of it for me as well, yes. I feel it's very easy to notice when I skip words when reading out loud, and getting the cadence of a sentence right only works if you have a sense of how it relates to the previous and next one.

2Aprillion (Peter Hozák)6h

Yeah, I myself subvocalize absolutely everything and I am still horrified when I sometimes try any "fast" reading techniques - those drain all of the enjoyment our of reading for me, as if instead of characters in a story I would imagine them as p-zombies. For non-fiction, visual-only reading cuts connections to my previous knowledge (as if the text was a wave function entangled to the rest of the universe and by observing every sentence in isolation, I would collapse it to just "one sentence" without further meaning). I never move my lips or tongue though, I just do the voices (obviously, not just my voice ... imagine reading Dennett without Dennett's delivery, isn't that half of the experience gone? how do other people enjoy reading with most of the beauty missing?). It's faster then physical speech for me too, usually the same speed as verbal thinking.

Lorxus3h20

Yeah, I myself subvocalize absolutely everything and I am still horrified when I sometimes try any "fast" reading techniques - those drain all of the enjoyment our of reading for me, as if instead of characters in a story I would imagine them as p-zombies.

I speed-read fiction, too. When I do, though, I'll stop for a bit whenever something or someone new is being described, to give myself a moment to picture it in a way that my mind can bring up again as set dressing.

1Shoshannah Tekofsky3h

That sounds great! I have to admit that I still get a far richer experience from reading out loud than subvocalizing, and my subvocalizing can't go faster than my speech. So it sounds like you have an upgraded form with more speed and richness, which is great!

Creating unrestricted AI Agents with a refusal-vector ablated Llama 3 70B

Simon Lermen

15h

TLDR; I demonstrate the use of refusal vector ablation on Llama 3 70B to create a bad agent that can attempt malicious tasks such as trying to persuade and pay me to assassinate another individual. I introduce some early work on a benchmark for Safe Agents which comprises two small datasets, one benign, one bad. In general, Llama 3 70B is a competent agent with appropriate scaffolding, and Llama 3 8B also has decent performance.

Overview

In this post, I use insights from mechanistic interpretability to remove safety guardrails from the latest Llama 3 model. I then use a custom scaffolding for tool use and agentic planning to create a “bad” agent that can perform many unethical tasks. Examples include tasking the AI with persuading me to end the life of...

(Continue Reading – 2019 more words)

2the gears to ascension5h

You sure could have waited a day or two for someone else to get around to this. No reason to be the person who burns the last two days. (Of course, as usual, this would be better aimed upstream many steps. But it's the marginal difference that can be changed.)

3Simon Lermen4h

I also took into account that refusal-vector ablated models are available on huggingface and scaffolding, this post might still give it more exposure though. Also Llama 3 70B performs many unethical tasks without any attempt at circumventing safety. At that point I am really just applying a scaffolding. Do you think it is wrong to report on this? How could this go wrong, people realize how powerful this is and invest more time and resources into developing their own versions? I don't really think of this as alignment research, just want to show people how far along we are. Positive impact could be to prepare people for these agents going around, agents being used for demos. Also potentially convince labs to be more careful in their releases.

3Simon Lermen4h

Thanks for this comment, I take it very serious that things can inspire people and burn timeline. I think this is a good counterargument though: There is also something counterintuitive to this dynamic: as models become stronger, the barriers to entry will actually go down; i.e. you will be able to prompt the AI to build its own advanced scaffolding. Similarly, the user could just point the model at a paper on refusal-vector ablation or some other future technique and ask the model to essentially remove its own safety. I don't want to give people ideas or appear cynical here, sorry if that is the impression.

the gears to ascension3h31

No particular disagreement that your marginal contribution is low and that this has the potential to be useful for durable alignment. Like I said, I'm thinking in terms of not burning days with what one doesn't say.

LESSWRONG
LW

Quick Takes

Popular Comments

Recent Discussion

Here are some examples:

Summary

You’re Single Because Dating Apps Keep Getting Worse

Overview

LessOnline Festival

Ticket prices increase in 3 days