Welp, I guess my life is comic sans today. The EA Forum snuck some code into our deployment bundle for my account in particular, lol: https://github.com/ForumMagnum/ForumMagnum/pull/9042/commits/ad99a147824584ea64b5a1d0f01e3f2aa728f83a
Thoughts on integrity and accountability
[Epistemic Status: Early draft version of a post I hope to publish eventually. Strongly interested in feedback and critiques, since I feel quite fuzzy about a lot of this]
When I started studying rationality and philosophy, I had the perspective that people who were in positions of power and influence should primarily focus on how to make good decisions in general and that we should generally give power to people who have demonstrated a good track record of general rationality. I also thought of power as this mostly unconstrained resource, similar to having money in your bank account, and that we should make sure to primarily allocate power to the people who are good at thinking and making decisions.
That picture has changed a lot over the years. While I think there is still a lot of value in the idea of "philosopher kings", I've made a variety of updates that significantly changed my relationship to allocating power in this way:
This was a great post that might have changed my worldview some.
Some highlights:
1.
People's rationality is much more defined by their ability to maneuver themselves into environments in which their external incentives align with their goals, than by their ability to have correct opinions while being subject to incentives they don't endorse. This is a tractable intervention and so the best people will be able to have vastly more accurate beliefs than the average person, but it means that "having accurate beliefs in one domain" doesn't straightforwardly generalize to "will have accurate beliefs in other domains".
I've heard people say things like this in the past, but haven't really taken it seriously as an important component of my rationality practice. Somehow what you say here is compelling to me (maybe because I recently noticed a major place where my thinking was heavily constrained by my social ties and social standing) and it prodded me to think about how to build "mech suits" that not only increase my power but also incentivize my rationality. I now have a todo item to "think about principles for incentivizing true belief...
A thing that I've been thinking about for a while has been to somehow make LessWrong into something that could give rise to more personal-wikis and wiki-like content. Gwern's writing has a very different structure and quality to it than the posts on LW, with the key components being that they get updated regularly and serve as more stable references for some concept, as opposed to a post which is usually anchored in a specific point in time.
We have a pretty good wiki system for our tags, but never really allowed people to just make their personal wiki pages, mostly because there isn't really any place to find them. We could list the wiki pages you created on your profile, but that doesn't really seem like it would allocate attention to them successfully.
I was thinking about this more recently as Arbital is going through another round of slowly rotting away (its search currently being broken and this being very hard to fix due to annoying Google App Engine restrictions), and I've been thinking about importing all the Arbital content into LessWrong. That might be a natural time to do a final push to enable people to write more wiki-like content on the site.
somehow make LessWrong into something that could give rise to more personal-wikis and wiki-like content. Gwern's writing has a very different structure and quality to it than the posts on LW...We have a pretty good wiki system for our tags, but never really allowed people to just make their personal wiki pages, mostly because there isn't really any place to find them. We could list the wiki pages you created on your profile, but that doesn't really seem like it would allocate attention
Multi-version wikis are a hard design problem.
It's something that people kept trying, when they soured on a regular Wikipedia: "the need for consensus makes it impossible for minority views to get a fair hearing! I'll go make my own Wikipedia where everyone can have their own version of an entry, so people can see every side! with blackjack & hookers & booze!" And then it becomes a ghost town, just like every other attempt to replace Wikipedia. (And that's if you're lucky: if you're unlucky you turn into Conservapedia or Rational Wiki.) I'm not aware of any cases of 'non-consensus' wikis that really succeed - it seems that usually, there's so little editor activity to go around that having ...
One thing that feels cool about personal wikis is that people come up with their own factorization and ontology for the things they are thinking about...So I think in addition to the above there needs to be a way for users to easily and without friction add a personal article for some concept they care about, and to have a consistent link to it, in a way that doesn't destroy any of the benefits of the collaborative editing.
My proposal already provides a way to easily add a personal article with a consistent link, while preserving the ability to do collaborative editing on 'public' articles. Strictly speaking, it's fine for people to add wiki entries for their own factorization and ontology.
There is no requirement for those to all be 'official': there doesn't have to be a 'consensus' entry. Nothing about a /wiki/Acausal_cooperation/gwern user entry requires the /wiki/Acausal_cooperation consensus entry to exist. (Computers are flexible like that.) That just means there's nothing there at that exact URL, or probably better, it falls back to displaying all sub-pages of user entries like usual. (User entries presumably get some sort of visual styling, in the same way that comments o...
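A rough sketch of the fallback described above, in case it helps make the idea concrete. Everything here is hypothetical: the `resolve` function, the in-memory `PAGES` dict, and the `alice` user are made up for illustration and are not how the LessWrong codebase actually stores or routes wiki pages.

```python
from typing import Dict

# Hypothetical store mapping a wiki path to its page body.
# Note there is deliberately no consensus entry at /wiki/Acausal_cooperation.
PAGES: Dict[str, str] = {
    "/wiki/Acausal_cooperation/gwern": "gwern's personal entry",
    "/wiki/Acausal_cooperation/alice": "alice's personal entry",
}


def resolve(path: str, pages: Dict[str, str]) -> dict:
    """Return the page at `path`, or fall back to listing its user sub-pages."""
    path = path.rstrip("/")
    if path in pages:
        # An entry (consensus or user) exists at exactly this URL.
        return {"kind": "page", "path": path, "body": pages[path]}

    prefix = path + "/"
    # Only direct children count as user entries for this concept.
    entries = sorted(
        p for p in pages
        if p.startswith(prefix) and "/" not in p[len(prefix):]
    )
    if entries:
        # No consensus entry: show all user entries instead.
        return {"kind": "listing", "path": path, "entries": entries}

    return {"kind": "missing", "path": path}
```

With this, `resolve("/wiki/Acausal_cooperation", PAGES)` returns a listing of the two user entries even though no consensus page exists, while the `/gwern` URL keeps resolving to the same personal entry regardless of whether a consensus page is ever written.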
Btw less.online is happening. LW post and frontpage banner probably going up Sunday or early next week.
Thoughts on voting as approve/disapprove and agree/disagree:
One of the things that I am most uncomfortable with in the current LessWrong voting system is how often I feel conflicted between upvoting something because I want to encourage the author to write more comments like it, and downvoting something because I think the argument that the author makes is importantly flawed and I don't want other readers to walk away with a misunderstanding about the world.
I think this effect quite strongly limits certain forms of intellectual diversity on LessWrong, because many people will only upvote your comment if they agree with it, and downvote comments they disagree with, and this means that arguments supporting people's existing conclusions have a strong advantage in the current karma system. Whereas the most valuable comments are likely ones that challenge existing beliefs and that are rigorously arguing for unpopular positions.
A feature that has been suggested many times over the years is to split voting into two dimensions. One dimension being "agree/disagree" and the other being "approve/disapprove". Only the "approve/disapprove" dimension m...
Having a reaction for "changed my view" would be very nice.
Features like custom reactions give me this feeling that.. language will emerge from allowing people to create reactions that will be hard to anticipate but, in retrospect, crucial. Playing a similar role that body language plays during conversation, but designed, defined, explicit.
If someone did want to introduce the delta through this system, it might be necessary to give the coiner of a reaction some way of linking an extended description. In casual exchanges.. I've found myself reaching for an expression that means "shifted my views in some significant lasting way" that's kind of hard to explain in precise terms, and probably impossible to reduce to one or two words, but it feels like a crucial thing to measure. In my description, I would explain that a lot of dialogue has no lasting impact on its participants, it is just two people trying to better understand where they already are. When something really impactful is said, I think we need to establish a habit of noticing and recognising that.
But I don't know. Maybe that's not the reaction type that will justify the feature. Maybe it will be something we can't think of now.
Generally, it seems useful to be able to take reduced measurements of the mental states of the readers.
the language that will emerge from allowing people to create reactions that will be hard to anticipate but, in retrospect, crucial
This is essentially the concept of a folksonomy, and I agree that it is potentially both applicable here and quite important.
LessWrong has a karma system, mostly based off of Reddit's karma system, with some improvements and tweaks to it. I've thought a lot about more improvements to it, but one roadblock that I always run into when trying to improve the karma system, is that it actually serves a lot of different uses, and changing it in one way often means completely destroying its ability to function in a different way. Let me try to summarize what I think the different purposes of the karma system are:
Helping users filter content
The most obvious purpose of the karma system is to determine how long a post is displayed on the frontpage, and how much visibility it should get.
Being a social reward for good content
This aspect of the karma system comes out more when thinking about Facebook "likes". Often when I upvote a post, it is more of a public signal that I value something, with the goal that the author will feel rewarded for putting their effort into writing the relevant content.
Creating common-knowledge about what is good and bad
This aspect of the karma system comes out the most when dealing with debates, though it's present in basically any kar...
I just came back from talking to Max Harms about the Crystal trilogy, which made me think about rationalist fiction, or the concept of hard sci-fi combined with explorations of cognitive science and philosophy of science in general (which is how I conceptualize the idea of rationalist fiction).
I have a general sense that one of the biggest obstacles for making progress on difficult problems is something that I would describe as “focusing attention on the problem”. I feel like after an initial burst of problem-solving activity, most people, when working on hard problems, either give up, or start focusing on ways to avoid the problem, or sometimes start building a lot of infrastructure around the problem in a way that doesn’t really try to solve it.
I feel like one of the most important tools/skills that I see top scientists or problem solvers in general use is utilizing workflows and methods that allow them to focus on a difficult problem for days and months, instead of just hours.
I think at least for me, the case of exam environments displays this effect pretty strongly. I have a sense that in an exam environment, if I am given a question, I successfully focus my fu
Is intellectual progress in the head or in the paper?
Which of the two generates more value:
I think which of the two will generate more value determines a lot of your strategy about how to go about creating intellectual progress. In one model what matters is that the best individuals hear about the most important ideas in a way that then allows them to make progress on other problems. In the other model what matters is that the idea gets written as an artifact that can be processed and evaluated by reviews and the proper methods of the scientific progress, and then built upon when referenced and cited.
I think there is a tradeoff of short-term progress against long-term progress in these two approaches. I think many fields can go through intense periods of progress when focusing on just establishing communication between the best researchers of the field, but would be surprised if that period lasts longer than one or two decades. He...
Thoughts on minimalism, elegance and the internet:
I have this vision for LessWrong of a website that gives you the space to think for yourself, and doesn't constantly distract you with flashy colors and bright notifications and vibrant pictures. Instead it tries to be muted in a way that allows you to access the relevant information, but still gives you the space to disengage from the content of your screen, take a step back and ask yourself "what are my goals right now?".
I don't know how well we achieved that so far. I like our frontpage, and I think the post-reading experience is quite exceptionally focused and clear, but I think there is still something about the way the whole site is structured, with its focus on recent content and new discussion that often makes me feel scattered when I visit the site.
I think a major problem is that LessWrong doesn't make it easy to do only a single focused thing on the site at a time, and it doesn't currently really encourage you to engage with the site in a focused way. We have the library, which I do think is decent, but the sequence navigation experience is not yet fully what I would like it to be, and when...
Thoughts on negative karma notifications:
The motivation was (among other things) several people saying to us "yo, I wish LessWrong was a bit more of a skinner box because right now it's so thoroughly not a skinner box that it just doesn't make it into my habits, and I endorse it being a stronger habit than it currently is."
That depends on what norm is in place. If the norm is to explain downvoting, then people should explain, otherwise there is no issue in not doing so. So the claim you are making is that the norm should be for people to explain. The well-known counterargument is that this disincentivizes downvoting.
you are under no obligation to waste cognition trying to figure them out
There is rarely an obligation to understand things, but healthy curiosity ensures progress on recurring events, irrespective of morality of their origin. If an obligation would force you to actually waste cognition, don't accept it!
Thoughts on impact measures and making AI traps
I was chatting with Turntrout today about impact measures, and ended up making some points that I think are good to write up more generally.
One of the primary reasons why I am usually unexcited about impact measures is that I have a sense that they often "push the confusion into a corner" in a way that actually makes solving the problem harder. As a concrete example, I think a bunch of naive impact regularization metrics basically end up shunting the problem of "get an AI to do what we want" into the problem of "prevent the agent from interfering with other actors in the system".
The second one sounds easier, but mostly just turns out to also require a coherent concept and reference of human preferences to resolve, so you get very little from pushing the problem around that way, and sometimes get a false sense of security because the problem appears to be solved in some of the toy problems you constructed.
I am definitely concerned that Turntrout's AUP does the same, just in a more complicated way, but am a bit more optimistic than that, mostly because I do have a sense that in the AUP case there is actually some meaningful reduction go
...Printing more rationality books: I've been quite impressed with the success of the printed copies of R:A-Z and think we should invest resources into printing more of the other best writing that has been posted on LessWrong and the broad diaspora.
I think a Codex book would be amazing, but I think there also exists potential for printing smaller books on things like Slack/Sabbath/etc., and many other topics that have received a lot of other coverage over the years. I would also be really excited about printing HPMOR, though that has some copyright complications to it.
My current model is that there exist many people interested in rationality who don't like reading longform things on the internet and are much more likely to read things when they are in printed form. I also think there is a lot of value in organizing writing into book formats. There is also the benefit that the book now becomes a potential gift for someone else to read, which I think is a pretty common way ideas spread.
I have some plans to try to compile some book-length sequences of LessWrong content and see whether we can get things printed (obviously in coordination with the authors of the relevant pieces).
Forecasting on LessWrong: I've been thinking for quite a while about somehow integrating forecasts and prediction-market-like stuff into LessWrong. Arbital has these small forecasting boxes that look like this:
I generally liked these, and think they provided a good amount of value to the platform. I think our implementation would probably take up less space, but the broad gist of Arbital's implementation seems like a good first pass.
I do also have some concerns about forecasting and prediction markets. In particular I have a sense that philosophical and mathematical progress only rarely benefits from attaching concrete probabilities to things, and works more via mathematical proof and trying to achieve very high confidence on some simple claims by ruling out all other interpretations as obviously contradictory. I am worried that emphasizing probability much more on the site would make progress on those kinds of issues harder.
I also think a lot of intellectual progress is primarily ontological, and given my experience with existing forecasting platforms and Zvi's sequence on prediction markets, they are not very good at resolving ontological confusions and ...
This feature is important to me. It might turn out to be a dud, but I would be excited to experiment with it. If it was available in a way that was portable to other websites as well, that would be even more exciting to me (e.g. I could do this in my base blog).
Note that this feature can be used for more than forecasting. One key use case on Arbital was to see who was willing to endorse or disagree with, to what extent, various claims relevant to the post. That seemed very useful.
I don't think having internal betting markets is going to add enough value to justify the costs involved. Especially since it both can't be real money (for legal reasons, etc) and can't not be real money if it's going to do what it needs to do.
Note that Paul Christiano warns against encouraging sluggish updating by massively publicising people’s updates and judging them on it. Not sure what implementation details this suggests yet, but I do want to think about it.
https://sideways-view.com/2018/07/12/epistemic-incentives-and-sluggish-updating/
Had a very aggressive crawler basically DDoS-ing us from a few dozen IPs for the last hour. Sorry for the slower server response times. Things should be fixed now.
Random thoughts on game theory and what it means to be a good person
It does seem to me like there doesn’t exist any good writing on game theory from a TDT perspective. Whenever I read classical game theory, I feel like the equilibria that are being described obviously fall apart when counterfactuals are being properly brought into the mix (like D/D in prisoner’s dilemmas).
The obvious problem with TDT-based game theory, just as it is with Bayesian epistemology, is that the vast majority of direct applications are completely computationally intractable. It’s kind of obvious what should happen in games with lots of copies of yourself, but as soon as anything participates that isn’t a precise copy, everything gets a lot more confusing. So it is not fully clear what a practical game-theory literature from a TDT-perspective would look like, though maybe the existing LessWrong literature on Bayesian epistemology might be a good inspiration.
Even when you can’t fully compute everything (and we don’t even really know how to compute everything in principle), you might still be able to go through concrete scenarios and list considerations and perspectives that incorporate TDT-perspectives. I guess in t
...Reading through this, I went "well, obviously I pay the mugger...
...oh, I see what you're doing here."
I don't have a full answer to the problem you're specifying, but something that seems relevant is the question of "How much do you want to invest in the ability to punish defectors [both in terms of maximum power-to-punish, a-la nukes, and in terms of your ability to dole out fine-grained-exactly-correct punishment, a-la skilled assassins]"
The answer to this depends on your context. And how you have answered this question determines whether it makes sense to punish people in particular contexts.
In many cases you might want some amount of randomization, where at least some of the time you really disproportionately punish people, but you don't have to pay the cost of doing so every time.
Answering a couple of the concrete questions:
Mugger
Right now, in real life, I've never been mugged, and I feel fine basically investing zero effort into preparing for being mugged. If I do get mugged, I will just hand over my wallet.
If I was getting mugged all the time, I'd probably invest effort into a) figuring out what good policies existed ...
Making yourself understandable to other people
(Epistemic status: Processing obvious things that have likely been written many times before, but that are still useful to have written up in my own language)
How do you act in the context of a community that is vetting constrained? I think there are fundamentally two approaches you can use to establish coordination with other parties:
1. Professionalism: Establish that you are taking concrete actions with predictable consequences that are definitely positive
2. Alignment: Establish that you are a competent actor that is acting with intentions that are aligned with the aims of others
I think a lot of the concepts around professionalism arise when you have a group of people who are trying to coordinate, but do not actually have aligned interests. In those situations you will have lots of contracts and commitments to actions that have well-specified outcomes and deviations from those outcomes are generally considered bad. It also encourages a certain suppression of agency and a fear of people doing independent optimization in a way that is not transparent to the rest of the group.
Given a lot of these drawbacks, it seems natural to aim for e...
This FB post by Matt Bell on the Delta Variant helped me orient a good amount:
https://www.facebook.com/thismattbell/posts/10161279341706038
...As has been the case for almost the entire pandemic, we can predict the future by looking at the present. Let’s tackle the question of “Should I worry about the Delta variant?” There’s now enough data out of Israel and the UK to get a good picture of this, as nearly all cases in Israel and the UK for the last few weeks have been the Delta variant. [1] Israel was until recently the most-vaccinated major country in the world, and is a good analog to the US because they’ve almost entirely used mRNA vaccines.
- If you’re fully vaccinated and aren’t in a high risk group, the Delta variant looks like it might be “just the flu”. There are some scary headlines going around, like “Half of new cases in Israel are among vaccinated people”, but they’re misleading for a couple of reasons. First, since Israel has vaccinated over 80% of the eligible population, the mRNA vaccine still is 1-((0.5/0.8)/(0.5/0.2)) = 75% effective against infection with the Delta variant. Furthermore, the efficacy of the mRNA vaccine is still very high ( > 90%) against hosp
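To spell out the arithmetic in the quoted post: the 75% figure compares the infection rate among vaccinated people (their share of cases divided by their share of the population) with the same rate among unvaccinated people. Here is a minimal sketch of that calculation. The function name and parameters are mine, and it assumes, as the post's numbers implicitly do, that the 80% coverage figure applies to the same population the case counts come from, and that nothing else (age, testing rates, behavior) differs between the groups.

```python
def vaccine_effectiveness(share_of_cases_vaccinated: float,
                          vaccination_coverage: float) -> float:
    """Estimate effectiveness as 1 minus the relative risk of infection.

    share_of_cases_vaccinated: fraction of new cases that are in vaccinated people.
    vaccination_coverage: fraction of the population that is vaccinated.
    """
    # Cases per unit of population, up to a shared constant that cancels out.
    rate_vaccinated = share_of_cases_vaccinated / vaccination_coverage
    rate_unvaccinated = (1 - share_of_cases_vaccinated) / (1 - vaccination_coverage)
    return 1 - rate_vaccinated / rate_unvaccinated


# The post's numbers: half of cases among the vaccinated, 80% vaccinated.
print(f"{vaccine_effectiveness(0.5, 0.8):.0%}")  # prints 75%
```

This reproduces 1-((0.5/0.8)/(0.5/0.2)) from the post. It also shows why the "half of new cases are among vaccinated people" headline is misleading: at 80% coverage, a vaccine with no effect at all would leave 80% of cases among the vaccinated, not 50%.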
This seems like potentially a big deal: https://mobile.twitter.com/DrEricDing/status/1402062059890786311
> Troubling—the worst variant to date, the #DeltaVariant is now the new fastest growing variant in US. This is the so-called “Indian” variant #B16172 that is ravaging the UK despite high vaccinations because it has immune evasion properties. Here is why it’s trouble—Thread. #COVID19
@Elizabeth was interested in me crossposting this comment from the EA Forum since she thinks there isn't enough writing on the importance of design on LW. So here it is.
Atlas reportedly spent $10,000 on a coffee table. Is this true? Why was the table so expensive?
Atlas at some point bought this table, I think: https://sisyphus-industries.com/product/metal-coffee-table/. At that link it costs around $2200, so I highly doubt the $10,000 number.
Lightcone then bought that table from Atlas a few months ago at the listing price, since Jonas thought the purchase ...
Since this hash is publicly posted, is there any timescale for when we should check back to see the preimage?
In an attempt to get myself to write more here is my own shortform feed. Ideally I would write something daily, but we will see how it goes.