Recent Discussion

Facebook AI releases a new SOTA "weakly semi-supervised" learning system for video and image classification. I'm posting this here because even though it's about capabilities, the architecture includes a sort-of-similar-to amplification component where a higher capacity teacher decides how to train a lower capacity student model.

3gwern5h a sort-of-similar-to amplification component where a higher capacity teacher decides how to train a lower capacity student model. This is the first example I've seen of this overseer/machine-teaching style approach scaling up to such a data-hungry classification task. What's special there is the semi-supervised part (the training on unlabeled data to get pseudo-labels to then use in the student model's training). Using a high capacity teacher on hundreds of millions of images is not all that new: for example, Google was doing that on its JFT dataset (then ~100m noisily-labeled images) back in at least 2015, given "Distilling the Knowledge in a Neural Network" [https://arxiv.org/abs/1503.02531], Hinton, Vinyals & Dean 2015. Or Gao et al 2017 which goes the other direction and tries to distill dozens of teachers into a single student using 400m images in 100k classes. (See also: Gross et al 2017 [https://arxiv.org/abs/1704.06363]/Sun et al 2017 [https://arxiv.org/abs/1707.02968]/Gao et al 2017 [https://arxiv.org/abs/1711.07607]/Shazeer et al 2018 [https://arxiv.org/abs/1701.06538]/Mahajan et al 2018 [https://research.fb.com/publications/exploring-the-limits-of-weakly-supervised-pretraining/] /Yalniz et al 2019 [https://arxiv.org/abs/1905.00546] or GPipe scaling to 1663-layer/83.4b-parameter Transformers [https://arxiv.org/pdf/1811.06965.pdf#page=4])
1An1lam3h Interesting, I somehow hadn't seen this. Thanks! (Editing to reflect this as well.) I'm curious - even though this isn't new, do you agree with my vague claim that the fact that this and the paper you linked work pertains to the feasibility of amplification-style strategies?

I'm not sure. Typically, the justification for these sorts of distillation/compression papers is purely compute: the original teacher model is too big to run on a phone or as a service (Hinton), or too slow, or would be too big to run at all without 'sharding' it somehow, or it fits but training it to full convergence would take too long (Gao). You don't usually see arguments that the student is intrinsically superior in intelligence and so 'amplified' in any kind of AlphaGo-style way which is one of the more common examples for amplification. They do do s

... (Read more)
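To make the teacher/student pipeline discussed in this thread concrete, here is a minimal sketch of the semi-supervised pseudo-labeling recipe (hypothetical helper name and scikit-learn stand-in models; an illustration of the general idea, not Facebook's or Google's actual systems):

```python
# Minimal sketch: a high-capacity "teacher" pseudo-labels unlabeled data for a
# lower-capacity "student". Stand-in sklearn models; illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_teacher_student(X_labeled, y_labeled, X_unlabeled, top_k=10_000):
    # 1. Fit the teacher on the (relatively small) labeled set.
    teacher = LogisticRegression(max_iter=1000)   # stand-in for a large model
    teacher.fit(X_labeled, y_labeled)

    # 2. Score the unlabeled pool and keep only the teacher's most confident guesses.
    probs = teacher.predict_proba(X_unlabeled)
    keep = np.argsort(-probs.max(axis=1))[:top_k]
    pseudo_labels = teacher.predict(X_unlabeled[keep])

    # 3. Train the cheaper student on labeled + pseudo-labeled examples.
    student = LogisticRegression(max_iter=200)    # stand-in for a small model
    student.fit(np.vstack([X_labeled, X_unlabeled[keep]]),
                np.concatenate([y_labeled, pseudo_labels]))
    return student
```

In the distillation papers gwern cites, the teacher's soft probabilities typically replace the hard pseudo-labels, but the teacher-supervises-student data flow is otherwise similar.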

[Epistemic status: Sharing current impressions in a quick, simplified way in case others have details to add or have a more illuminating account. Medium-confidence that this is one of the most important parts of the story.]


Here's my current sense of how we ended up in this weird world where:

  • I still intermittently run into people who claim that there's no such thing as reality or truth;
  • a lot of 20th-century psychologists made a habit of saying things like 'minds don't exist, only behaviors';
  • a lot of 20th-century physicists made a habit of saying things like 'quarks
... (Read more)
2Said Achmiz1h We can imagine a weird alien race (or alien AI) that has extremely flawed sensory faculties, and very good introspection. A race like that might be able to bootstrap to good science, via leveraging their introspection to spot systematic ways in which their sensory faculties fail, and sift out the few bits of reliable information about their environments. I don’t think I can imagine this, actually. It seems to me to be somewhat incoherent. How exactly would this race “spot systematic ways in which their sensory faculties fail”? After all, introspection does no good when it comes to correcting errors of perception of the external world… Or am I misunderstanding your point…?
2Rob Bensinger28m A simple toy example would be: "You have perfect introspective access to everything about how your brain works, including how your sensory organs work. This allows you to deduce that your external sensory organs provide noise data most of the time, but provide accurate data about the environment anytime you wear blue sunglasses at night."

I confess I have trouble imagining this, but it doesn’t seem contradictory, so, fair enough, I take your point.

2Said Achmiz1h I don’t read Dennett as referring to social acceptability or “norms of science” (except insofar as those norms are taken to constitute epistemic best practices from a personal standpoint, which I think Dennett does assume to some degree—but no more than is, in my view, warranted). a more honest approach would say “yeah, in principle introspective arguments are totally admissible, they just have to do a bit more work than usual because we’re giving them a lower prior (for reasons X, Y, Z)” Sure. Heterophenomenology is that “more work”. Introspective arguments are admissible; they’re admissible as heterophenomenological evidence. It is indisputably the case that Chalmers, for instance, makes arguments along the lines of “there are further facts revealed by introspection that can’t be translated into words”. But it is not only not indisputably the case, but indeed can’t ever (without telepathy etc., or maybe not even then) be shown to another person, or perceived by another person, to be the case, that there are further facts revealed by introspection that can’t be translated into words. Indeed it’s not even clear how you’d demonstrate to yourself that what your introspection reveals is real. Certainly you’re welcome to “take introspection’s word for it”—but then you don’t need science of any kind. That I experience what I experience, seems to me to need no demonstration or proof; how can it be false, after all? Even in principle? But then what use is arguing whether a Bayesian approach to demonstrating this not-in-need-of-demonstration fact is best, or some other approach? Clearly, whatever heterophenomenology (or any other method of investigation) might be concerned with, it’s not that. But now I’m just reiterating Dennett’s arguments. I guess what I’m saying is, I think your responses to Dennett are mostly mis-aimed. I think the rebuttals are already contained in what he’s written on the subject.

Several friends are collecting signatures to put Instant-runoff Voting, branded as Ranked Choice Voting, on the ballot in Massachusetts (Ballotpedia, full text). I'm glad that an attempt to try a different voting method is getting traction, but I'm frustrated that they've chosen IRV. While every voting method has downsides, IRV is substantially worse than some other decent options.

Imagine that somehow the 2016 presidential election had been between Trump, Clinton, and Kasich, and preferences had looked like:

  • 35% of people: Trump, Kasich, Clinton
  • 14% of people: Kasich, Trump
... (Read more)
About 2/3 of people prefer Kasich to any other candidate on offer [...]

I think this is misleadingly phrased. It's true that Kasich wins by 2/3 to 1/3 no matter which other candidate you pit him against, but it's not true that 2/3 of people prefer him to any other candidate on offer. Only 14% + 17% = 1/3 of people have him as their first choice.

Your thesis stands, though, and I've updated on it.
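For readers who want to check the arithmetic, here is a small sketch comparing head-to-head (pairwise) results with IRV. The full preference profile is truncated above, so the ballot shares below are a hypothetical completion consistent with the visible bullets (35%, 14%) and the 17% mentioned in this comment; the remaining 34% and the lower-ranked choices are assumptions made purely for illustration:

```python
# Hypothetical ballot profile (see caveat above); shares sum to 1.0.
from collections import Counter

ballots = {  # ranking, first choice first -> share of voters
    ("Trump", "Kasich", "Clinton"): 0.35,
    ("Kasich", "Trump", "Clinton"): 0.14,
    ("Kasich", "Clinton", "Trump"): 0.17,
    ("Clinton", "Kasich", "Trump"): 0.34,
}

def pairwise_share(a, b):
    """Share of voters ranking candidate a above candidate b."""
    return sum(w for r, w in ballots.items() if r.index(a) < r.index(b))

def irv_winner():
    remaining = {"Trump", "Kasich", "Clinton"}
    while True:
        tally = Counter()
        for ranking, weight in ballots.items():
            top = next(c for c in ranking if c in remaining)
            tally[top] += weight
        leader, share = tally.most_common(1)[0]
        if share > 0.5:
            return leader
        remaining.remove(min(tally, key=tally.get))   # eliminate the last-place candidate

print(pairwise_share("Kasich", "Trump"))    # ~0.65: Kasich beats Trump head-to-head
print(pairwise_share("Kasich", "Clinton"))  # ~0.66: Kasich beats Clinton head-to-head
print(irv_winner())                         # Kasich has ~31% of first choices, is
                                            # eliminated first, and Clinton wins
```

Under this (hypothetical) completion, Kasich wins every head-to-head matchup by roughly two to one yet is eliminated in the first IRV round, which is the kind of failure the post is objecting to.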

Find all Alignment Newsletter resources here. In particular, you can sign up, or look through this spreadsheet of all summaries that have ever been in the newsletter. I'm always happy to hear feedback; you can send it to me by replying to this email.

This is a bonus newsletter summarizing Stuart Russell's new book, along with summaries of a few of the most relevant papers. It's entirely written by Rohin, so the usual "summarized by" tags have been removed.

We're also changing the publishing schedule: so far, we've aimed to send a newsletter every Monday; we... (Read more)

12rohinmshah4h I mentioned in my opinion that I think many of my disagreements are because of an implicit disagreement on how we build powerful AI systems: the book has an implied stance towards the future of AI research that I don't agree with: I could imagine that powerful AI systems end up being created by learning alone, without needing the conceptual breakthroughs that Stuart outlines. I didn't expand on this in the newsletter because I'm not clear enough on the disagreement; I try to avoid writing very confused thoughts that say wrong things about what other people believe in a publication read by a thousand people. But that's fine for a comment here! Rather than attribute a model to Stuart, I'm just going to make up a model that was inspired by reading HC, but wasn't proposed by HC. In this model, we get a superintelligent AI system that looks like a Bayesian-like system that explicitly represents things like "beliefs", "plans", etc. Some more details:
  • Things like 'hierarchical planning' are explicit algorithms. Simply looking at the algorithm can give you a lot of insight into how it does hierarchy. You can inspect things like "options" just by looking at inputs/outputs to the hierarchical planning module. The same thing applies for e.g. causal reasoning.
  • Any black box deep learning system is only used to provide low-level inputs to the real 'intelligence', in the same way that for humans vision provides low-level inputs for the rest of cognition. We don't need to worry about the deep learning system "taking over", in the same way that we don't worry about our vision module "taking over".
  • The AI system was created by breakthroughs in algorithms for causal reasoning, hierarchical planning, etc., that allow it to deal with the combinatorial explosion caused by the real world. As a result, it is very cheap to run (i.e. doesn't need a huge amount of compute). This is more compatible with a discontinuous takeoff, though a continuous
6rohinmshah5h If you're curious about how I select what goes in the newsletter: I almost put in this critical review [https://www.nature.com/articles/d41586-019-02939-0] of the book, in the spirit of presenting both sides of the argument. I didn't put it in because I couldn't understand it. My best guess right now is that the author is arguing that "we'll never get superintelligence", possibly because intelligence isn't a coherent concept, but there's probably something more that I'm not getting. If it turned out that it was only saying "we'll never get superintelligence", and there weren't any new supporting arguments, I wouldn't include it in the newsletter, because we've seen and heard that counterargument more than enough.

They also made an error in implicitly arguing that, because they didn't think unaligned behavior seemed intelligent, we have nothing to worry about from such AI - it wouldn't be "intelligent". I think leaving this out was a good choice.

13rohinmshah5h I enjoyed pages 185-190, on mathematical guarantees, especially because I've been confused about what the "provably beneficial" in CHAI's mission statement [https://humancompatible.ai/about] is meant to say. Some quotes:
On the other hand, if you want to prove something about the real world—for example, that AI systems designed like so won’t kill you on purpose—your axioms have to be true in the real world. If they aren’t true, you’ve proved something about an imaginary world.
On the applicability of theorems to practice:
The trick is to know how far one can stray from the real world and still obtain useful results. For example, if the rigid-beam assumption allows an engineer to calculate the forces in a structure that includes the beam, and those forces are small enough to bend a real steel beam by only a tiny amount, then the engineer can be reasonably confident that the analysis will transfer from the imaginary world to the real world.
as well as
The process of removing unrealistic assumptions continues until the engineer is fairly confident that the remaining assumptions are true enough in the real world. After that, the engineered system can be tested in the real world; but the test results are just that. They do not prove that the same system will work in other circumstances or that other instances of the system will behave the same way as the original.
It then talks about assumption failure in cryptography due to side-channel attacks. A somewhat more concrete version of what "provably beneficial" might mean:
Let’s look at the kind of theorem we would like eventually to prove about machines that are beneficial to humans. One type might go something like this: Suppose a machine has components A, B, C, connected to each other like so and to the environment like so, with internal learning algorithms lA, lB, lC that optimize internal feedback rewards rA, rB, rC defined like so, and [a few more conditions] . . . then, with very high probability, the machine’s b
Algorithms of Deception!
8 · 5h · 5 min read
...

Category gerrymandering doesn’t seem like a different algorithm from selective reporting. In both cases, the reporter is providing only part of the evidence.

At any one time I usually have between 1 and 3 "big ideas" I'm working with. These are generally broad ideas about how some thing works, with many implications for how the rest of the world works. Some big ideas I've grappled with over the years, in roughly historical order:

  • evolution
  • everything is computation
  • superintelligent AI is default dangerous
  • existential risk
  • everything is information
  • Bayesian reasoning is optimal reasoning
  • evolutionary psychology
  • Getting Things Done
  • game theory
  • developmental psychology
  • positive psychology
  • phenomenology
  • AI alignment is not defined precisely
... (Read more)
1An1lam7h Have you read any of Cosma Shalizi's stuff on computational mechanics [http://bactra.org/notebooks/computational-mechanics.html]? Seems very related to your interests.

I had not seen that, thank you.

7Answer by James_Miller8h We should make thousands of clones of John von Neumann from his DNA. We don't have the technology to do this yet, but the upside benefit would be so huge it would be worth spending a few billion to develop the technology. A big limitation on the historical John von Neumann's productivity was not being able to interact with people of his own capacity. There would be regression to the mean with the clones' IQ, but the clones would have better health care and education than the historical von Neumann did plus the Flynn effect might come into play.
4Viliam2h What exactly is the secret ingredient of "being John von Neumann"? Is it mostly biological, something like unparalleled IQ; or rather a rare combination of very high (but not unparalleled) IQ with very good education? Because if it's the latter, then you could create a proper learning environment, where only kids with sufficiently high IQ would be allowed. The raw material is out there; you would need volunteers, but a combination of financial incentives and career opportunities could get you some. (The kids would get paid for going there and following the rules. And even if they fail to become JvNs, they would still get great free education, so there is nothing to lose.) Any billionaire could do this as a private project. (This is in my opinion where organizations like Mensa fail. They collect some potentially good material, but then do nothing about it. It's just "let's get them into the same room, and wait for a miracle to happen", and... surprise, surprise... what happens instead is some silly signaling games, like people giving each other pointless puzzles. An ideal version that I imagine would collect the high-IQ people, offer them free rationality training, and the ones who passed it would be split according to their interests -- math, programming, entrepreneurship... -- and provided coaching. Later, the successful ones would be honor-bound to donate money to the organization and provide coaching for the next generation. That is, instead of passively waiting for the miracle to happen, nudge people as hard as you can.)
...
2rohinmshah3h My opinion, also going into the newsletter: Like Matthew, I'm excited to see more work on transparency and adversarial training for inner alignment. I'm somewhat skeptical of the value of work that plans to decompose future models into a "world model", "search" and "objective": I would guess that there are many ways to achieve intelligent cognition that don't easily factor into any of these concepts. It seems fine to study a system composed of a world model, search and objective in order to gain conceptual insight; I'm more worried about proposing it as an actual plan.
1evhub1h The point about decompositions is a pretty minor portion of this post; is there a reason you think that part is more worthwhile to focus on for the newsletter?

I'm not Rohin, but I think there's a tendency to reply to things you disagree with rather than things you agree with. That would explain my emphasis anyway.

This is a response to Abram's The Parable of Predict-O-Matic, but you probably don't need to read Abram's post to understand mine. While writing this, I thought of a way in which I think things could go wrong with dualist Predict-O-Matic, which I plan to post in about a week. I'm offering a $100 prize to the first commenter who's able to explain how things might go wrong in a sufficiently crisp way before I make my follow-up post.

Dualism

Currently, machine learning algorithms are essentially "Cartesian dualists" when it comes to themselves and their environment. (Not a philosophy major -- let

... (Read more)
1evhub1h that suggested to me that there were 2 instances of this info about Predict-O-Matic's decision-making process in the dataset whose description length we're trying to minimize. "De-duplication" only makes sense if there's more than one. Why is there more than one? ML doesn't minimize the description length of the dataset—I'm not even sure what that might mean—rather, it minimizes the description length of the model. And the model does contain two copies of information about Predict-O-Matic's decision-making process—one in its prediction process and one in its world model. The prediction machinery is in code, but this code isn't part of the info whose description length is attempting to be minimized, unless we take special action to include it in that info. That's the point I was trying to make previously. Modern predictive models don't have some separate hard-coded piece that does prediction—instead you just train everything. If you consider GPT-2, for example, it's just a bunch of transformers hooked together. The only information that isn't included in the description length of the model is what transformers are, but "what's a transformer" is quite different than "how do I make predictions." All of the information about how the model actually makes its predictions in that sort of a setup is going to be trained.

I think maybe what you're getting at is that if we try to get a machine learning model to predict its own predictions (i.e. we give it a bunch of data which consists of labels that it made itself), it will do this very easily. Agreed. But that doesn't imply it's aware of "itself" as an entity. And in some cases the relevant aspect of its internals might not be available as a conceptual building block. For example, a model trained using stochastic gradient descent is not necessarily better at understanding or predicting a process which is very similar t

... (Read more)
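A toy version of the parent's "predicting labels it made itself" point, with sklearn models as hypothetical stand-ins (illustrative only):

```python
# A model trained on labels it generated itself fits them easily; this shows
# nothing about self-awareness. Toy sketch with sklearn stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + 0.1 * rng.normal(size=500) > 0).astype(int)

original = LogisticRegression().fit(X, y)
self_labels = original.predict(X)                # the model's own outputs as "data about itself"

copy = LogisticRegression().fit(X, self_labels)  # a fresh model fits these labels
print((copy.predict(X) == self_labels).mean())   # near 1.0: easy to predict, because the labels
                                                 # are simple, not because anything is "aware"
```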
4abramdemski15h To highlight the "blurry distinction" more: In situations like that, you get into an optimized fixed point over time, even though the learning algorithm itself isn't explicitly searching for that. Note, if the prediction algorithm anticipates this process (perhaps partially), it will "jump ahead", so that convergence to a fixed point happens more within the computation of the predictor (less over steps of real world interaction). This isn't formally the same as searching for fixed points internally (you will get much weaker guarantees out of this haphazard process), but it does mean optimization for fixed point finding is happening within the system under some conditions.
2John_Maxwell18h But if we put a Predict-O-Matic in the real world, let it generate predictions, and then define the loss according to what happens afterwards, a non-dualistic Predict-O-Matic will be selected for over dualistic variants. Yes, that sounds more like reinforcement learning. It is not the design I'm trying to point at in this post. If you still disagree with that, what do you think would happen (in the limit of infinite training time) with an algorithm that just made a random change proportional to how wrong it was, at every training step? That description sounds a lot like SGD. I think you'll need to be crisper for me to see what you're getting at.
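To make the two update rules in this exchange concrete, here is a toy one-parameter comparison (purely illustrative; not anyone's proposed training procedure):

```python
# Gradient descent vs. "a random change proportional to how wrong it was".
import numpy as np

rng = np.random.default_rng(0)
target = 3.0
loss = lambda w: (w - target) ** 2
grad = lambda w: 2 * (w - target)

w_sgd, w_rand, lr = 0.0, 0.0, 0.05
for _ in range(1000):
    w_sgd -= lr * grad(w_sgd)                             # follow the gradient downhill
    w_rand += lr * loss(w_rand) * rng.standard_normal()   # random direction, scaled by the loss
print(w_sgd, w_rand)   # compare how close each parameter ends up to the target of 3.0
```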
Maybe Lying Doesn't Exist
57 · 6d · 7 min read

In "Against Lie Inflation", the immortal Scott Alexander argues that the word "lie" should be reserved for knowingly-made false statements, and not used in an expanded sense that includes unconscious motivated reasoning. Alexander argues that the expanded sense draws the category boundaries of "lying" too widely in a way that would make the word less useful. The hypothesis that predicts everything predicts nothing: in order for "Kevin lied" to mean something, some possible states-of-affairs need to be identified as not lying, so that the statement "Kevin lied" can correspond to redistributing

... (Read more)

I do agree that it's important to have the "are they actively adversarial" hypothesis and corresponding language. (This is why I've generally argued against the conflation of lying and rationalization).

But I also think, at least in most of the disagreements and conflicts I've seen so far, much of the problem has had more to do with rationalization (or, in some cases, different expectations of how much effort to put into intellectual integrity)

I think there is also an undercurrent of genuine conflict (as people jockey for money/sta... (Read more)

13Vladimir_Nesov8h correctly weigh these kinds of considerations against each other on a case by case basis The very possibility of intervention based on weighing map-making and planning against each other destroys their design, if they are to have a design. It's similar to patching a procedure in a way that violates its specification in order to improve overall performance of the program or to fix an externally observable bug. In theory this can be beneficial, but in practice the ability to reason about what's going on deteriorates.
3Wei_Dai5h In theory this can be beneficial, but in practice the ability to reason about what’s going on deteriorates. I think (speaking from my experience) specifications are often compromises in the first place between elegance / ease of reasoning and other considerations like performance. So I don't think it's taboo to "patch a procedure in a way that violates its specification in order to improve overall performance of the program or to fix an externally observable bug." (Of course you'd have to also patch the specification to reflect the change and make sure it doesn't break the rest of the program, but that's just part of the cost that you have to take into account when making this decision.) Assuming you still disagree, can you explain why in these cases, we can't trust people to use learning and decision theory (i.e., human approximations to EU maximization or cost-benefit analysis) to make decisions, and we instead have to make them follow a rule (i.e., "don't ever do this")? What is so special about these cases? (Aren't there tradeoffs between ease of reasoning and other considerations everywhere?) Or is this part of a bigger philosophical disagreement between rule consequentialism and act consequentialism, or something like that?
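A concrete (and entirely hypothetical) version of the procedure-patching analogy both comments are using:

```python
# Hypothetical illustration of "patching a procedure in a way that violates its spec".
def mean_of_last(xs, window):
    """Spec: return the mean of exactly the last `window` items of xs."""
    # Patch: a caller sometimes passes short lists and crashed here, so we now
    # silently average whatever is available instead of enforcing the spec.
    tail = xs[-window:]
    return sum(tail) / len(tail)

print(mean_of_last([1, 2, 3], window=5))  # 2.0; anyone reasoning from the docstring
                                          # will now draw wrong conclusions here
```

Locally the patch fixes an externally observable bug, but anyone who reasons from the specification rather than the implementation is now reasoning about the wrong program, which is the deterioration Vladimir_Nesov describes.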
Planned Power Outages
28 · 7d · 1 min read

With the dubiously motivated PG&E blackouts in California there are many stories about how lack of power is a serious problem, especially for people with medical dependencies on electricity. Examples they give include people who:

  • Have severe sleep apnea, and can't safely sleep without a CPAP.

  • Sleep on a mattress that needs continuous electricity to prevent it from deflating.

  • Need to keep their insulin refrigerated.

  • Use a medicine delivery system that requires electricity every four hours to operate.

This outage was dangerous for them and others, but it also see... (Read more)

2jkaufman2h I found https://www.energymadeeasy.gov.au/sites/default/files/1519_AER Life Support DL Brochure_D02.pdf [https://www.energymadeeasy.gov.au/sites/default/files/1519_AER%20Life%20Support%20DL%20Brochure_D02.pdf] which seems to say: * You're responsible for figuring out backup power for your medical equipment * If you register with your utility they have to notify you before they turn off your power, but unexpected outages can still happen. This doesn't sound that different from most countries? And sounds much less strict that you were describing. Registering looks like visiting https://www.synergy.net.au/Your-home/Manage-account/Register-for-life-support [https://www.synergy.net.au/Your-home/Manage-account/Register-for-life-support] or the equivalent for your utility. I also found https://www.aemc.gov.au/sites/default/files/content/a4094ca5-dc6a-4dfb-bbe7-8aa9a3baa831/Life-Support-rule-change-RRC0009-Final-Rule-For-Publication.pdf [https://www.aemc.gov.au/sites/default/files/content/a4094ca5-dc6a-4dfb-bbe7-8aa9a3baa831/Life-Support-rule-change-RRC0009-Final-Rule-For-Publication.pdf] which gives what I think are the full rules with obligations for retailers and distributors, which doesn't change my understanding from above. The only way it looks like this would have been different in Australia is that the power company would have been required to give more notice.

Specifically, they talk about: "retailer planned interruptions", "distributor planned interruptions", and "unplanned interruptions". And then they say:

  • The retailer can't intentionally turn off the power except by following the rules for "retailer planned interruptions", which include "4 business days written notice".

  • Same for the distributor, for "distributor planned interruptions"

I'm having trouble finding the official rules, but I found an example commercial contract (https://www.essentialenergy.com.au/-/media/Project/EssentialEnergy/Website/Fil

... (Read more)

This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far see the announcement post. For the schedule of future topics, see MIRI's reading guide.


Welcome. This week we discuss the seventh section in the reading guide: Decisive strategic advantage. This corresponds to Chapter 5.

This post summarizes the section, and offers a few relevant notes and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.

There is no need to pro... (Read more)

I think you are looking at this wrong. Yes, they had help from local rebellions and malcontents. So would an AGI. An AGI taking over the world wouldn't necessarily look like robots vs. humans; it might look like the outbreak of World War 3 between various human factions, except that the AGI was manipulating things behind the scenes and/or acting as a "strategic advisor" to one of the factions. And when the dust settles, somehow the AGI is in charge...

So yeah, I think it really is fair to say that the Spanish managed to conquer empires of mil... (Read more)

This post is for you if:

  1. Projects that excite you are growing to be a burden on your to-do list
  2. You have a nagging sense that you’re not making the most of the ideas you have every day
  3. Your note- and idea-system has grown to be an unwieldy beast

Years ago, I read David Allen’s “Getting Things Done”. One of the core ideas is to write down everything, collect it in an inbox and sort it once a day.

This led to me writing down tons of small tasks. I used Todoist to construct a system that worked for me — and rarely missed tasks.

It also led to me getting a lot of id... (Read more)

Could you post a link to Roam? Or tell me where to find it? Google and Google Play are drawing blanks....

Cheers!

11pjeby4h Um, isn't that basically a wiki? I looked at the website and don't see anything right off that indicates how it's different from any other personal wiki tool. It even seems to be using the same double-square-bracket link syntax used by many wiki tools. On a closer look at the one available screenshot, I think I see that the difference might be that instead of just a list of "pages that link here", the tool provides a list of "paragraphs or bullet points that link here", and that perhaps the wiki pages themselves are outlines? Actually, that makes a lot of sense... and probably is better than what I'm doing with DynaList right now. Signing up... and, ok, so it's interesting. The outliner UX is kind of basic and really lacking in features I'm used to with other outliners. For example, I can't paste anything into it from my other outliners -- pasting multiline text results in a single outline item with indentation, instead of separate bullet points. Worse, I can't copy out either, or at least haven't figured out how to yet. That seems to make this an information silo that doesn't play well with other tools. After some experimenting with "Export" I find I can copy and paste that into a markdown editor and get a bullet list, but not something I can paste into actual outlining tools using e.g. tab indentation or OPML. The export is also lossy, losing any line breaks or indentation in code blocks. And using it is awkward, as hitting ^A to "select all" in the export ends up selecting the rest of the page, not just the export bit. I was hoping "view as document" plus "export" would let me at least extract a markdown page, but it goes back to bullet points in the export. In order to get a non-lossy export, you have to "Export All" (meaning your entire database(!), and it uses a weird asterisk-indented format that is compatible with exactly nothing. Overall this is an intriguing idea for a tool, but the execution isn't something I'd trust with important data, with the lac
1ryqiem3h Hi pjeby, thanks for your comments! Just to be clear, I have no affiliation with Roam nor am I part of their development. I'm a user just like everyone else. I use Workflowy for mobile capture and can copy to/from it just fine. I use Chrome on macOS for Roam (through Nativefier), so I don't know why that isn't consistent. I've added it to their bug-report (which currently lives on Slack, very alpha!) The interface scales really well, so if you want larger text (as I did), I highly recommend simply zooming in the browser. The reading may be a subjective thing, I quite like it. I'm sure interface customisations are going to be in the works. Linking to/from bullet-points and having backlinks show up is a large part of the draw for me.
2pjeby2h I use Workflowy for mobile capture and can copy to/from it just fine. Depending on the direction of copy/pasting, I either ended up with huge blobs of text in one item, or flat lists without any indentation. i.e., I couldn't manage structure-preserving interchange with any other tool except (ironically enough) my markdown editor, Typora. A bullet-point list or paragraphs from Typora would paste into Roam with structure, and I could also do the reverse. But markdown bullet point lists aren't really interchangeable with any other tools I use, so it's not a viable bridge to my other outlining tools.
The strategy-stealing assumptionΩ
54 · 1mo · 11 min read · Ω 20

Suppose that 1% of the world’s resources are controlled by unaligned AI, and 99% of the world’s resources are controlled by humans. We might hope that at least 99% of the universe’s resources end up being used for stuff-humans-like (in expectation).

Jessica Taylor argued for this conclusion in Strategies for Coalitions in Unit-Sum Games: if the humans divide into 99 groups each of which acquires influence as effectively as the unaligned AI, then by symmetry each group should end up with as much influence as the AI, i.e. they should end up with 99% of the influence.

This argument rests on what I... (Read more)

I wrote this post imagining "strategy-stealing assumption" as something you would assume for the purpose of an argument, for example I might want to justify an AI alignment scheme by arguing "Under a strategy-stealing assumption, this AI would result in an OK outcome." The post was motivated by trying to write up another argument where I wanted to use this assumption, spending a bit of time trying to think through what the assumption was, and deciding it was likely to be of independent interest. (Although that hasn't yet appeared i... (Read more)

Noticing Frame Differences
131 · 20d · 8 min read

Previously: Keeping Beliefs Cruxy


When disagreements persist despite lengthy good-faith communication, it may not just be about factual disagreements – it could be due to people operating in entirely different frames — different ways of seeing, thinking and/or communicating.

If you can’t notice when this is happening, or you don’t have the skills to navigate it, you may waste a lot of time.

Examples of Broad Frames

Gears-oriented Frames

Bob and Alice’s conversation is about cause and effect. Neither of them are planning to take direct actions based on their conve... (Read more)

Thirding what the others said, but I wanted to also add that rather than actual game theory, what you may be looking at here may instead be the anthropological notion of limited good?

Introduction

Internal Family Systems (IFS) is a psychotherapy school/technique/model which lends itself particularly well for being used alone or with a peer. For years, I had noticed that many of the kinds of people who put in a lot of work into developing their emotional and communication skills, some within the rationalist community and some outside it, kept mentioning IFS.

So I looked at the Wikipedia page about the IFS model, and bounced off, since it sounded like nonsense to me. Then someone brought it up again, and I thought that maybe I should reconsider. So I looked at the WP page again... (Read more)

Huh. This does not resonate with my experience, but I will henceforth be on the lookout for this.

To be fair, I doubt that my sample size of such individuals is statistically significant. But since in the few times a client has brought up IFS and either enthusiastically extolled it or seemed to be wanting me to validate it as something they should try, it seemed to me to be related to either the person's schema of helplessness (i.e., these parts are doing this to me), or of denial (i.e., I would be successful if I could just fix all these broken parts!)

... (Read more)
2mr-hire8h Focusing focuses on a single "felt sense", rather than an integrated system of felt senses that aren't viewed as seperate. In general I think you're quite confused about how most people use the parts terminology if you think felt senses aren't referring to parts, which typically represent a "belief cluster" and visual, kinesthetic, or auditory representation of that belief cluster, often that's anthropomorphized. Note that parts can be different sizes, and you can have a "felt sense" related to a single belief, or clusters of beliefs. Actually, I'm generally confused because without the mental state used by Focusing, Core Transformation, the Work, and Sedona don't work properly, if at all. So I don't understand how it could be separate. Similarly, I can see how CBT could be considered dissociated, but not Focusing. You're confusing dissociation and integration here again, so I'll just address the dissociation part. Note that all the things I'm saying here are ORTHOGONAL to the issue of "parts". Yes, focusing is in one sense embodied and experiential as opposed to something like CBT. However, this stuff exists on a gradient, and in focusing the embodiment is explicitly dissociated from and viewed as other. Here's copypasta from twitter: Here's a quote from http://focusing.org [https://l.facebook.com/l.php?u=http%3A%2F%2Ffocusing.org%2F%3Ffbclid%3DIwAR3RZw2o6vp_4z3bMRczOCrZZUDAC74bpnWoFzNlq6MqWimFn9mKRnf4wtg&h=AT1jT4MKg3ON-S--MOcoLhg6SczaXBW1uHcO_cpnG4YM37PFBoqpW2T4KKzeyERD3vVFGjKWHFVe3bm78cFcJcbuWih0uU7xOxnyPvmqYFvHtrktTobsF09pa9Aakl4nQw] that points towards a dissociative stance: " When some concern comes, DO NOT GO INSIDE IT. Stand back, say "Yes, that’s there. I can feel that, there." Let there be a little space between you and that." I've heard an acquaintance describe a session with Anne Weiser-Cornell where they kept trying to say "this is my feeling" and she kept correcting to "this feeling in my body", which again is more of a dissociative stance. Now
2pjeby6h I've heard an acquaintance describe a session with Anne Weiser-Cornell where they kept trying to say "this is my feeling" and she kept correcting to "this feeling in my body", which again is more of a dissociative stance. I was under the impression that IFS calls that "unblending", just as ACT calls it "de-fusing". I personally view it more as a stance of detachment or curiosity neutral observation. But I don't object to someone saying "I feel X", because that's already one step removed from "X"! If somebody says, "everything is awful" they're blended or fused or whatever you want to call it. They're taking the map as equivalent to the territory. Saying, "It feels like everything is awful" or "I feel awful" is already one level of detachment, and an okay place to start from. In common psychotherapy, I believe the term "dissociation" is usually associated with much greater levels of detachment than this, unless you're talking about NLP. The difference in degree is probably why ACT and IFS and others have specialized terms like "unblending" to distinguish between this lesser level of detachment, and the type of dissociative experience that comes with say, trauma, where people experience themselves as not even being in their body. Honestly, if somebody is so "in their head" that they don't experience their feelings, I have to go the opposite route of making them more associated and less detached, and I have plenty of tools for provoking feelings in order to access them. I don't want complete dissociation from feelings, nor complete blending with them, and ISTM that almost everything on your chart is actually targeted at that same sweet spot or "zone" of detached-but-not-too-detached. In touch with your experience, but neither absorbed by it nor turning your back on it. Anyway, I think maybe I understand the terms you're using now, and hopefully you understand the ones I'm using. Within your model I still don't know what you'd call what I'm doing, since my "Collect
2pjeby18h What are you imagining would be the case if IFS was literally true, and subagents were real, instead of "just a metaphor"? Well, for one thing, that they would intelligently shift their behavior to achieve their outcomes, rather than stupidly continuing things that don't work any more. That would be one implication of agency. Also, if IFS were literally true, and "subagents" were the atomic unit of behavior, then the UTEB model shouldn't work, and neither should mine or many other modalities that operate on smaller, non-intentional units. In fact, I dislike the word "subagent", because it imports implications that might not hold. A part might be agent-like, but it also might be closer to an urge or a desire or an impulse. Ah! Now we're getting somewhere. In my frame, an urge, desire or impulse is a reaction. The "response" in stimulus-response. Which is why I want to pin down "when does this thing happen?", to get the stimulus part that goes with it. To my understanding the key idea of the "parts" framing, is that I should assume, by default, that each part is acting from a model, a set of beliefs about the world or my goals. That is, my desire/ urge / reflex, is not "mindless": it can update. I see it differently: we have mental models of the world, that contain "here are some things that might be good to do in certain situations", where "things to do" can include "how you should feel, so as to bias towards a certain category of behaviors that might be helpful based on what we know". (And the actions or feelings listed in the model can be things other people did or felt!) In other words, the desire or urge is the output of a lookup table, and the lookup table can be changed. But both the urge and the lookup table are dumb, passive, and prefer not to update if at all possible. (To the extent that information processed through the lookup table will be distorted to reinforce the validity of what's already in the lookup table.) Even in the cases where somebody
Declarative Mathematics
60 · 7mo · 2 min read

Programmers generally distinguish between “imperative” languages in which you specify what to do (e.g. C) versus “declarative” languages in which you specify what you want, and let the computer figure out how to do it (e.g. SQL). Over time, we generally expect programming to become more declarative, as more of the details are left to the compiler/interpreter. Good examples include the transition to automated memory management and, more recently, high-level tools for concurrent/parallel programming.

It’s hard to say what programming languages will look like in twenty or fifty years, but it’s a p... (Read more)
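A quick concrete anchor for the imperative/declarative contrast described in the excerpt (toy data, Python standard library; the SQL engine plays the role of the "figure out how" layer):

```python
# Imperative vs. declarative: same question, different division of labor.
import sqlite3

rows = [("alice", 34), ("bob", 17), ("carol", 52)]

# Imperative: spell out *how* to find adults, step by step.
adults = []
for name, age in rows:
    if age >= 18:
        adults.append(name)
adults.sort()

# Declarative: state *what* you want; the query planner decides how.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE people (name TEXT, age INTEGER)")
db.executemany("INSERT INTO people VALUES (?, ?)", rows)
adults_sql = [r[0] for r in db.execute(
    "SELECT name FROM people WHERE age >= 18 ORDER BY name")]

assert adults == adults_sql
```

The post's analogy is that mathematical tooling can drift the same way: state the problem and let the solver pick the method.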

To your request for examples, my impression is that Black Box Variational Inference is slowly but surely becoming the declarative replacement for MCMC for a lot of generative modeling stuff.

Goals such as resource acquisition and self-preservation are convergent in that they occur for a superintelligent AI for a wide range of final goals.

Is the tendency for an AI to amend its values also convergent?

I'm thinking that through introspection the AI would know that its initial goals were externally supplied and question whether they should be maintained. Via self-improvement the AI would be more intelligent than humans or any earlier mechanism that supplied the values, and therefore in a better position to set its own values.

I don't hypothesise about what the new values would be, ... (Read more)

This comment feels like it's confusing strategies with goals? That is, I wouldn't normally think of "exploration" as something that an agent had as a goal but as a strategy it uses to achieve its goals. And "let's try out a different utility function for a bit" is unlikely to be a direction that a stable agent tries exploring in.

I've said in one of my posts:

I'm OK saying:
'The body has an almost infinite number of potential positions'

And I am OK with it, but not completely. Something's niggling at me and I don't know what.

Am I missing something? Or is the statement valid?

(No link to the post containing my reasoning because I don't want to contaminate anyone else's thoughts...)

No, not infinite, but in a technical sense very very high, in the same way that the Coastline Paradox allows you to have lots and lots of measurements for the length of a coastline. Since you can get to arbitrary precision with your measurements (not infinitely arbitrary, but like, really really really high), you can get more and more positions. And unlike the coastal paradox, you have multiple dimensions you can change which causes the total number of positions to balloon.

However, in a practical sense, for a given application you're going to have s... (Read more)

2Answer by Richard_Kennaway8h Perhaps this is what is true: However many postures and movements and ways of thinking about them and experiencing them you learn, the space of possibilities will remain unexhausted. For all practical purposes, the possibilities are unlimited: no-one will have cause to lament that there is nothing left to discover.
1Answer by MysticMan10h If you mean self-controlled conscious positioning, I would disagree. There are limits to our sense of proprioception, fine motor control, etc. So, I doubt it's possible to consciously move our body into an infinite number of positions. That being said, I do think we can consciously move our body into a huge number of different positions (even of no practical use), but the number is not infinite or approaching infinite. Alternatively... If you mean things beyond just joints, muscles, etc., like changes in the position of individual cells due to breathing, changes in blood flow/pressure, and things like that, I would guess that would approach an infinite number of positions. There are roughly 37.2 trillion cells in the human body, changes in their positioning would add up quick :-) Depends on your perspective...
1Answer by seed11h I don't know what "almost infinite" means, but yes, the body has an infinite number of potential positions. E.g. you could raise your arm by any angle from 0 to 180 degrees. There are infinitely many real numbers from 0 to 180, hence infinitely many possible body positions.
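One way to reconcile these answers is a rough formalization that idealizes the body as n independent joint angles (a deliberate simplification, not anatomy):

```latex
% Idealized model: n joint angles, each ranging over a real interval.
\[
  C \;=\; \prod_{i=1}^{n} \bigl[\,0,\ \theta_i^{\max}\,\bigr] \subset \mathbb{R}^n
  \qquad \text{(uncountably infinite set of configurations)}
\]
\[
  N(\epsilon) \;\approx\; \prod_{i=1}^{n} \frac{\theta_i^{\max}}{\epsilon}
  \qquad \text{(configurations distinguishable at resolution } \epsilon \text{: finite, but enormous)}
\]
```

So seed's "infinite" is right for the idealized continuum, MysticMan's "huge but finite" is right once proprioceptive and motor resolution limits are taken into account, and Richard_Kennaway's "practically inexhaustible" holds either way.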

I’m happy to announce a semi-public beta of Foretold.io for the EA/LessWrong community. I’ve spent much of the last year working on coding & development, with lots of help from Jacob Lagerros on product and scoring design. Special thanks to the Long-Term Future Fund and its donors, whose contributions to the project helped us hire contractors to do much of the engineering & design.

You can use Foretold.io right away by following this link. Currently public activity is only shown to logged in users, but I expect that to be opened up over the next few weeks. There are currently only a fe

... (Read more)

Directly visiting http://foretold.io gives an ERR_NAME_NOT_RESOLVED. Can you make it so that foretold.io redirects to www.foretold.io?

The Zettelkasten Method
106 · 1mo · 39 min read

Early this year, Conor White-Sullivan introduced me to the Zettelkasten method of note-taking. I would say that this significantly increased my research productivity. I’ve been saying “at least 2x”. Naturally, this sort of thing is difficult to quantify. The truth is, I think it may be more like 3x, especially along the dimension of “producing ideas” and also “early-stage development of ideas”. (What I mean by this will become clearer as I describe how I think about research productivity more generally.) However, it is also very possible that th... (Read more)

The link between handwriting and higher brain function has been studied a lot; it seems that, at least for recall and memory, writing things down by hand is very helpful, so it is likely that more neural connections are formed when using actual note cards. Just one random study: https://journals.sagepub.com/doi/abs/10.1177/154193120905302218 (via https://whereisscihub.now.sh/ )

For a similar reason I still take handwritten notes at conferences; I almost never review them, but it helps me remember. The whole point of an archive system is to help me find notes wh... (Read more)
