sapphire
Don't induce psychosis intentionally. Don't take psychedelics while someone probes your beliefs. Don't let anyone associated with Michael Vassar anywhere near you during an altered state.

Vassar's followers practice intentionally inducing psychosis via psychedelic drugs. "Inducing psychosis" is a verbatim self-report of what they are doing; I would say they practice drug-induced brainwashing. To be clear, they would dispute the term "brainwashing" and probably would not like the term "followers" either, but I think the terms are accurate, and they are certainly his intellectual descendants.

Several people have had quite severe adverse reactions (as observed by me), for example rapidly developing serious, literal schizophrenia: schizophrenia in the very literal sense of paranoid delusions and conspiratorial interpretations of other people's behavior. The local Vassarite who did the 'therapy'/'brainwashing' seems completely unbothered by this literal schizophrenia.

As you can imagine, this behavior can cause substantial social disruption, especially since the Vassarites don't exactly believe in social harmony. All of this has precipitated serious mental health events in many other parties, though those are less obviously serious than "they are clinically schizophrenic now." But that is a high bar.

I have been very critical of cover-ups on LessWrong. I'm not going to name names, and maybe you don't trust me, but I have observed all of this directly. If you let people toy with your brain while you are under the influence of psychedelics, you should expect high odds of severe consequences. And your friends' mental health might suffer as well.

Edit: these are recent events, to my knowledge never referenced on LessWrong.
If Biden pardons people like Fauci for crimes like perjury, that would set a bad precedent. There's a reason perjury is forbidden, and if you simply give pardons to any government official who committed crimes at the end of an administration, that's a very bad precedent.

One way out of that would be to find a different way to punish government criminals when they are pardoned. One aspect of a pardon is that it removes the Fifth Amendment defense. You can subpoena pardoned people in front of Congress and ask them under oath to speak about all the crimes they committed that they can't be prosecuted for because of the pardon. Then you can charge them for any lies where they didn't volunteer information about pardoned crimes they committed.
Neel Nanda
A tip for anyone on the ML job/PhD market: people will plausibly be quickly skimming your Google Scholar to get a "how impressive is this person / what is their deal" read (I do this fairly often), so I recommend polishing your Google Scholar if you have publications! It can make a big difference. I have a lot of weird citable artefacts that confuse Google Scholar, so here are some tips I've picked up:

* First, make a Google Scholar profile if you don't already have one!
* Verify the email (otherwise it doesn't show up properly in search).
* (Important!) If you are co-first author on a paper but not in the first position, indicate this by editing the names of all co-first authors to end in a *
  * You edit by logging in to the Google account you made the profile with, going to your profile, clicking on the paper's name, and then editing the authors' names.
  * Co-first vs second author makes a big difference to how impressive a paper is, so you really want this to be clear!
* Edit the venue of your work to be the most impressive place it was published, and include any notable awards from the venue (e.g. spotlight, oral, paper awards).
  * You can edit this by clicking on the paper name and editing the journal field.
  * If it was a workshop, make sure you include the word "workshop" (otherwise it can appear deceptive).
  * See my profile for examples.
* Hunt for lost citations: often papers have weirdly formatted citations, and Google Scholar gets confused and thinks it was a different paper. You can often find these by clicking on the plus just below your profile picture, then "Add articles", and clicking through the pages for anything that you wrote. Add all these papers, and then use the merge function to combine them into one paper (with a combined citation count).
  * Merge lets you choose which of the merged artefacts gets displayed.
  * Merge = return to the main page, click the tick box next to the paper titles, then clicking merge at th
Basically every time a new model is released by a major lab, I hear from at least one person (not always the same person) that it's a big step forward in programming capability/usefulness. And then David gives it a try, and it works qualitatively the same as everything else: great as a substitute for Stack Overflow, can do some transpilation if you don't mind generating kinda crap code and needing to do a bunch of bug fixes, and somewhere between useless and actively harmful on anything even remotely complicated.

It would be nice if there were someone who tries out every new model's coding capabilities shortly after it comes out and gives reviews with a decent chance of actually matching David's or my experience using the thing (90% of which will be "not much change"), rather than getting all excited every single damn time. But to be a useful signal, they still need to actually get excited when there's an actually significant change. Anybody know of such a source?

EDIT-TO-ADD: David has a comment below with a couple examples of coding tasks.
At long last, I'm delurking here. Hi!

Popular Comments

Recent Discussion

This is a brief summary of what we believe to be the most important takeaways from our new paper and from our findings shown in the o1 system card. We also specifically clarify what we think we did NOT show. 

Paper: https://www.apolloresearch.ai/research/scheming-reasoning-evaluations 

Twitter about paper: https://x.com/apolloaisafety/status/1864735819207995716 

Twitter about o1 system card: https://x.com/apolloaisafety/status/1864737158226928124 

What we think the most important findings are

Models are now capable enough to do in-context scheming reasoning

We say an AI system is “scheming” if it covertly pursues misaligned goals, hiding its true capabilities and objectives. We think that in order to scheme, models likely need to be goal-directed, situationally aware, and capable enough to reason about scheming as a strategy. In principle, models might acquire situational awareness and stable long-term goals during training, and then scheme in pursuit of those goals. We...

Seems like some measure of evidence -- maybe large, maybe tiny -- that "We don't know how to give AI values, just to make them imitate values" is false?

I am pessimistic about loss signals getting 1-to-1 internalised as goals or desires in a way that is predictable to us with our current state of knowledge on intelligence and agency, and would indeed tentatively consider this observation a tiny positive update.

5Aaron_Scher
This is important work, keep it up!
2Sodium
Almost certainly not an original idea: given the increasing fine-tuning access to models (see also the recent reinforcement fine-tuning thing from OpenAI), see if fine-tuning on goal-directed agent tasks for a while leads to the types of scheming seen in the paper. You could maybe just fine-tune on the model's own actions when it successfully solves SWE-Bench problems or something; a rough sketch of what that data prep could look like is below. (I think some of the Redwood folks might have already done something similar but haven't published it yet?)
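To make this concrete, here is a minimal data-preparation sketch, assuming you already have logged agent trajectories for SWE-Bench-style tasks with a pass/fail flag. All field names, function names, and the chat-message layout below are hypothetical illustrations of the generic format that OpenAI-style fine-tuning endpoints accept, not anything from the paper:

```python
import json

def build_finetune_dataset(trajectories, out_path="agentic_sft.jsonl"):
    """Keep only successful runs and turn each step into a chat-format
    fine-tuning example (hypothetical field names: 'passed_tests', 'steps',
    'observation', 'action')."""
    with open(out_path, "w") as f:
        for traj in trajectories:
            if not traj["passed_tests"]:   # only imitate the model's successful behaviour
                continue
            for step in traj["steps"]:
                example = {
                    "messages": [
                        {"role": "system",
                         "content": "You are an autonomous software engineering agent."},
                        {"role": "user", "content": step["observation"]},
                        {"role": "assistant", "content": step["action"]},
                    ]
                }
                f.write(json.dumps(example) + "\n")
```

The resulting file could then be fed to whatever fine-tuning access is available, and the tuned model re-run through the paper's scheming evaluations to see whether goal-directed agentic training changes the results.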
9Marius Hobbhahn
(Thanks to Bronson for privately pointing this out.) I think that, directionally, removing parts of the training data would probably make a difference, but potentially less than we might naively assume; e.g. see Evan's argument on the XRCP podcast. Also, I think you're right, and my statement that "I think for most practical considerations, it makes almost zero difference" was too strong.

The ACX/EA/LW Sofia Meetup for October will be on the 15th (Sunday) at 17:00 at the Mr. Pizza on Vasil Levski.

Sofia ACX started with the 2021 Meetups Everywhere round. Attendance hovers around 4-8 people. Everyone worries they're not serious enough about ACX to join, so you should banish that thought and come anyway. "Please feel free to come even if you feel awkward about it, even if you're not 'the typical ACX reader', even if you're worried people won't like you", even if you didn't come to the previous meetings, even if you don't speak Bulgarian, etc., etc.

Each month we pick something new to read and discuss. This time, we're discussing "Math's Fundamental Flaw," a Veritasium video on Gödel's Incompleteness Theorem.  

See you there.


Epistemic status: Toy model. Oversimplified, but has been anecdotally useful to at least a couple people, and I like it as a metaphor.

Introduction

I’d like to share a toy model of willpower: your psyche’s conscious verbal planner “earns” willpower (earns trust with the rest of your psyche) by choosing actions that nourish your fundamental, bottom-up processes in the long run.  For example, your verbal planner might expend willpower dragging you to disappointing first dates, then regain that willpower, and more, upon finding you a good long-term romance.  Wise verbal planners can acquire large willpower budgets by making plans that, on average, nourish your fundamental processes.  Delusional or uncaring verbal planners, on the other hand, usually become “burned out” – their willpower budget goes broke-ish, leaving them little to...

Viliam20

one thing I valued highly was free time, and regardless of how much money and status a 40 hour a week job gives you, that's still 40 hours a week in which your time isn't free!

Yeah, the same here. The harder I work, the more money I can get (though the relation is not linear; more like logarithmic), but at this point the thing I want is not money... it is free time!

I guess the official solution is to save money for early retirement. Which requires investing the money wisely, otherwise inflation eats it.

By the way, perhaps you could have some people check your resume, maybe you are doing something wrong there.

3kwang
This reminds me of this post from Gena Gorlin (and the themes in her writing more generally): https://builders.genagorlin.com/p/death-is-the-default

It doesn't quite map on precisely, but this quote seems to capture something you're also trying to get at: "On the contrary, remembering that “death is the default” should mobilize us to fight them with everything we’ve got—recognizing that the one thing we’ve got, in the fight against entropy, inertia, and death, is our power of agency."

I also see a parallel to two different conceptions of ethics:

* "Living willpower" is grounded in an ethics which is finite, "horizontal", and comes from within. Normative demands and ethical truths arise from mutual interdependence, from relationships, commitments, and attachments between finite agents.
* "Dead willpower" is grounded in an ethics which is infinite, "vertical", and comes from the outside. Normative demands and ethical truths exist independently of the finite agents they constrain. They come from God, social pressures, ideologies, are unchanging facts about the world, etc.

Lastly, I think your observation that healthy processes "must take an active interest in things they don't yet know" is perhaps a recognition that a key component of our finitude is our bounded rationality, and that we must recognize our limitations if we are to live well.

Related, here is something Yudkowsky wrote three years ago:

I'm about ready to propose a group norm against having any subgroups or leaders who tell other people they should take psychedelics.  Maybe they have individually motivated uses - though I get the impression that this is, at best, a high-variance bet with significantly negative expectation.  But the track record of "rationalist-adjacent" subgroups that push the practice internally and would-be leaders who suggest to other people that they do them seems just way too bad.

I'm also about read

... (read more)
2AprilSR
Some discussion of coverups can be found at https://www.lesswrong.com/posts/pQGFeKvjydztpgnsY/occupational-infohazards.
3Morphism
I think I know (80% confidence) the identity of this "local Vassarite" you are referring to, and I think I should reveal it, but, y'know, Unilateralist's Curse, so if anyone gives me a good enough reason not to reveal this person's name, I won't. Otherwise, I probably will, because right now I think people really should be warned about them.
1AprilSR
I'd appreciate a rain check to think about the best way to approach things. I agree it's probably better for more details here to be common knowledge but I'm worried about it turning into just like, another unnuanced accusation? Vague worries about Vassarites being culty and bad did not help me, a grounded analysis of the precise details might have.
This is a linkpost for https://doi.org/10.1093/pq/pqaa086

My paper with my Ph.D. advisor Vince Conitzer titled "Extracting Money from Causal Decision Theorists" has been formally published (Open Access) in The Philosophical Quarterly. Probably many of you have seen either earlier drafts of this paper or similar arguments that others have independently given on this forum (e.g., Stuart Armstrong posted about an almost identical scenario; Abram Demski's post on Dutch-Booking CDT also has some similar ideas) and elsewhere (e.g., Spencer (forthcoming)  and Ahmed (unpublished) both make arguments that resemble some points from our paper).

Our paper focuses on the following simple scenario which can be used to, you guessed it, extract money from causal decision theorists:

Adversarial Offer: Two boxes, $B_1$ and $B_2$, are on offer. A (risk-neutral) buyer may purchase one or none of the boxes but not both.

...
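To make the structure of the trap concrete, here is a small simulation sketch. The specific numbers ($1 per box, $3 placed in each box the seller predicted the buyer would not take, 75% prediction accuracy) are my recollection of the paper's setup and should be checked against the paper itself; the qualitative point is just that a buyer who evaluates each box by its unconditional expected contents keeps buying and loses money on average, while abstaining breaks even.

```python
import random

PRICE, PRIZE, ACCURACY = 1.0, 3.0, 0.75   # assumed values; check against the paper

def average_payoff(n_rounds=100_000, buyer="cdt"):
    """Simulate the Adversarial Offer: the seller predicts the buyer's choice
    (0 = buy nothing, 1 = box 1, 2 = box 2) with probability ACCURACY and puts
    the prize in every box she predicted the buyer would NOT take."""
    total = 0.0
    for _ in range(n_rounds):
        # A CDT-style buyer always buys some box (here box 1, by symmetry):
        # at least one box must contain the prize, so some box has unconditional
        # expected contents >= PRIZE/2 = 1.5 > PRICE. An abstainer buys nothing.
        choice = 1 if buyer == "cdt" else 0
        if random.random() < ACCURACY:
            predicted = choice
        else:
            predicted = random.choice([c for c in (0, 1, 2) if c != choice])
        if choice != 0:
            box_has_prize = (predicted != choice)   # prize goes only in non-predicted boxes
            total += (PRIZE if box_has_prize else 0.0) - PRICE
    return total / n_rounds

print("CDT-style buyer:", average_payoff(buyer="cdt"))   # roughly -0.25 per round
print("Abstainer:      ", average_payoff(buyer="none"))  # exactly 0
```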
2Ben
I see where you are coming from. But I think the reason we are interested in CDT (or any DT) in the first place is because we want to know which one works best. However, if we allow the outcomes to be judged not just on the decision we make, but also on the process used to reach that decision, then I don't think we can learn anything useful.

Or, to put it from a different angle: if the process P is used to reach decision X, but my "score" depends not just on X but also on P, then that can be mapped to a different problem where my decision is "P and X", and I use some other process (P') to decide which P to use.

For example, if a student on a maths paper is told they will be marked not just on the answer they give, but on the working they write on the paper, with points deducted for crossings-out or mistakes, we could easily imagine the student using other sheets of paper (or the inside of their head) to first work out the working they are going to show and the answer that goes with it. Here the decision problem's "output" is the entire exam paper, not just the answer.
3Daniel Kokotajlo
I don't think I understand this yet, or maybe I don't see how it's a strong enough reason to reject my claims, e.g. my claim "If standard game theory has nothing to say about what to do in situations where you don't have access to an unpredictable randomization mechanism, so much the worse for standard game theory, I say!"
Ben20

I think we might be talking past each other. I will try and clarify what I meant.

Firstly, I fully agree with you that standard game theory should give you access to randomization mechanisms. I was just saying that I think hypotheticals where you are judged on the process you use to decide, and not on your final decision, are a bad way of working out which processes are good, because the hypothetical can just declare any process to be the one it rewards by fiat.

Related to the randomization mechanisms, in the kinds of problems people worry about with pre... (read more)


TL;DR: Recently, Lucius held a presentation on the nature of deep learning and why it can generalise to new data. Kaarel, Dmitry, and Lucius talked about the slides for that presentation in a group chat. The conversation quickly became a broader discussion on the nature of intelligence and how much we do or don't know about it.

Background 

Lucius: I recently held a small talk presenting an idea for how and why deep learning generalises. It tried to reduce concepts from Singular Learning Theory back to basic algorithmic information theory, to sketch a unified picture that starts with Solomonoff induction and, with a lot of hand-waving, derives that under some assumptions, just fitting a big function to your data using a local optimisation method like stochastic gradient descent...
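For readers who want the starting point spelled out: the Solomonoff prior referred to here is the standard one from algorithmic information theory, which weights every program consistent with the data by its length. This is the textbook definition, not anything specific to Lucius's talk:

$$M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-|p|}$$

where $U$ is a universal prefix machine, the sum ranges over programs $p$ whose output begins with the string $x$, and $|p|$ is the length of $p$ in bits, so shorter (simpler) programs dominate the prior.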

If it’s worth saying, but not worth its own post, here's a place to put it.

If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.

If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.

The Open Thread tag is here. The Open Thread sequence is here.

1Aristotelis Kostelenos
I've been lurking for not years. I also have ADHD, and I deeply relate to your sentiment about the jargon here, and it doesn't help that when I manage to concentrate enough to get through a post, read the 5 substack articles it links to, and skim the 5 substack articles they link to, it's... pretty hit or miss. I remember reading one saying something about moral relativism not being obviously true, and it felt like all the jargon and all the philosophical concepts mentioned only served to sufficiently confuse the reader (and I guess the writer too) so that it's not.

I will say though that I don't get that feeling reading the Sequences, or stuff written by other rationalist GOATs. The obscure terms there don't serve as signals of the author's sophistication or as ways to make their ideas less accessible. They're there because there are actually useful bundles of meaning that are used often enough to warrant a shortcut.
1Sherrinford
No, all tags are on default weight.
2habryka
Could you send me a screenshot of your post list and tag filter list? What you are describing sounds really very weird to me and something must be going wrong.

The list is very long, so it is hard to make a screenshot. Now, with some hours of distance, I reloaded the homepage, tried again, and one 0-karma post appeared. (Last time it definitely did not; I searched very rigorously.)

However, the mathematical formula still tells me that all 0-karma posts should appear at the same position, and negative-karma posts below them?

Previously: Sadly, FTX

I doubted whether it would be a good use of time to read Michael Lewis’s new book Going Infinite about Sam Bankman-Fried (hereafter SBF or Sam). What would I learn that I did not already know? Was Michael Lewis so far in the tank of SBF that the book was filled with nonsense and not to be trusted?

I set up a prediction market, which somehow attracted over a hundred traders. Opinions were mixed. That, combined with Matt Levine clearly reporting having fun, felt good enough to give the book a try.

I need not have worried.

Going Infinite is awesome. I would have been happy with my decision on the basis of any one of the following:

The details I learned or clarified about the psychology of SBF...

+9. This is an at times hilarious, at times upsetting story of how a man gained a massive amount of power and built a corrupt empire. It's a psychological study, as well as the tale of a crime, hand in hand with a lot of naive ideologues.

I think it is worthwhile for understanding a lot about how the world currently works, including understanding individuals with great potential for harm, the crooked cryptocurrency industry, and the sorts of nerds in the world who falsely act in the name of good.

I don't believe that all the details here are fully accurate, but ... (read more)