Stephen Bennett (Previously GWS)


Austria is also the player instigating a plan of action in the dialogue, which seems to be how the AI is so effective. It seems like it wins because it proposes mutually beneficial plans and then (mostly) follows through on them.

Does putting the pasta in the kettle gunk it up? They're a bit hard to clean and I wouldn't want starchy water in my tea.

Overall I like the survey, especially its brevity. I think there's a way to make this question fit more naturally:

Above we're asking about attendance, but safety, comfort, and enjoyment are also important. If you answered "about the same" for any above but would feel better or worse about it, could you tell us more? [free text]

I'd change it to something like "We're interested in more than just attendance. If we changed any of the above policies, how would it affect your safety, comfort, and enjoyment at the dances?"

This reminds me of Scott Alexander's post on Pascalian Medicine:

The gist of that post is that if you take dozens to hundreds of medications for something, each of which is unlikely to work but carries basically no risk, the net effect of the medicines is in theory positive. I think this is probably false for acute illnesses, but it might be true for chronic illnesses: the cumulative damage of the illness is larger (barring consequences like death from the acute illness), and the odds of a negative interaction between medications are lower when you're only taking a handful at a given time (again, so long as nothing kills you; I don't know how to account for that because I don't know how likely it is). As a result, while each dose you take might have negative expected value in isolation (a couple of random side effects with no effect on your condition), the information you gain means that the lifetime consequences of taking the dose are positive. I have no idea how to even guess at the frequency with which a given medication will solve a given problem through pure luck, which is unfortunately the central number in this calculation.

I'm glad you found something that works for you!

It's nice to see that Katja is pretty well calibrated. Congratulations to her!

I remember listening to a podcast that had Daniel Kahneman on as a guest. The host asked Daniel (paraphrasing) 'Hey, so people have all these biases that keep them from reasoning correctly. What could I do to correct them?', and Daniel responded 'Oh, there's no hope there. You're just along for the ride; system 1 is going to do whatever it wants,' and I just felt so defeated. There's really no hope? There's no way we might think more clearly? I take this as a pretty big success, and a nice counterexample to Danny's claim that people are irredeemably irrational.

I haven't worked on any browser extensions before (not sure what language they're written in), but I do know javascript well enough. We can probably work something out!

Using javascript it'd be pretty easy to dynamically hide/show numbers depending on whether or not the element has a certain CSS class. In this case, the class of interest is "mb-0.5 mr-0.5 text-2xl" (which doesn't show up anywhere else on the main page).
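As a sketch of what that could look like (a minimal content-script fragment; the class string is the one quoted above, and the function names here are made up):

```javascript
// Convert a space-separated class string (as copied from dev tools) into a
// CSS selector, escaping the dots Tailwind uses in names like "mb-0.5".
function classesToSelector(classString) {
  return classString
    .trim()
    .split(/\s+/)
    .map((c) => "." + c.replace(/\./g, "\\."))
    .join("");
}

// In a browser extension this would run as a content script after the page
// loads, hiding or showing every element that matches the classes.
function setVisibility(classString, visible) {
  for (const el of document.querySelectorAll(classesToSelector(classString))) {
    el.style.display = visible ? "" : "none";
  }
}

// Usage on Manifold's main page:
// setVisibility("mb-0.5 mr-0.5 text-2xl", false);
```

The selector-escaping step matters because Tailwind class names contain literal dots, which a naive `"." + className` selector would misparse.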

More generally, the extension could allow users to mark numbers as "I want to see numbers that show up in places like this" or "I don't want to see numbers that show up in places like this" and then create a rule to show/hide all numbers in that class. If you create a central database of these votes, you can then extend this across users, so that when someone comes across a new website, numbers that they want to see are shown automatically.
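A crowdsourced rule in that database might look something like this (purely a sketch; the record shape, field names, and vote counts are all invented):

```javascript
// Hypothetical shape of one crowdsourced show/hide rule. The default for a
// new visitor is decided by majority vote among users who marked this class.
function makeRule(site, classString, votesShow, votesHide) {
  return {
    site,                          // e.g. "manifold.markets"
    classString,                   // the class list identifying the numbers
    show: votesShow >= votesHide,  // majority vote sets the default
  };
}

const rule = makeRule("manifold.markets", "mb-0.5 mr-0.5 text-2xl", 4, 17);
console.log(rule.show); // false: most users voted to hide these numbers
```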

Of course, sometimes classes aren't enough, but in the case of Manifold it seems like they'd be sufficient (users could vote on whether or not to see text with the class "text-base text-red-500", and likewise its green counterpart, which mark recent shifts in market probability).

One downside to this approach is that it would break whenever a page changes its class names, but if the rules are crowdsourced the breakage would probably get fixed pretty quickly and only impact a minority of users.

I don't misclassify every 1 in 1000 objects I see in daily life

Perhaps this is too nitpicky about semantics, but when I tried to evaluate the plausibility of this claim I came to a bit of an impasse.

You're supposing that there is a single natural category for each object you see (and this buries even the definition of "an object", which isn't obvious to me). I'd agree that you probably classify a given thing the same way >99.9% of the time. However, what would the inter-rater reliability be for these classifications? Are those classifications actually "correct" in a natural sense?

I use the entire scale depending on what is making the dish dirty. For the most part I give a 5, but I do acknowledge that some substances simply require abrasion. Peanut butter without added sugar, for example, is basically completely impervious to the efforts of any dishwasher I've ever used, so it gets a 1.

Your initial point was that "goals" aren't a quantifiable thing, and so it doesn't make sense to talk about "orthogonality", which I agree with. I was just saying that while goals aren't quantifiable, there are ways of quantifying alignment. The stuff about world states and Kendall's tau was a way to describe how you could assign a number to "alignment".

When I say world states, I mean some possible way the world is. For instance, it's pretty easy to imagine two similar world states: the one that we currently live in, and one that's the same except that I'm sitting cross-legged on my chair right now instead of having my knee propped against my desk. That's obviously a trivial difference and so gets nearly exactly the same rank as the world we actually live in. Another world state might be one in which everything is the same except that a cosmic ray has created a prion in my brain (which gets ranked much lower than the actual world).

Ranking all possible future world states is one way of expressing an agent's goals, and computing the similarity of these rankings between agents is one way of measuring alignment. For instance, if someone wants me to die, they might rank the Stephen-has-a-prion world quite highly, whereas I rank it quite low, and this will contribute to us having a low correlation between rank orderings over possible world states, and so by this metric we are unaligned from one another.
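As a toy illustration of that metric (the world states and ranks below are invented, and this is a naive O(n²) Kendall's tau with no handling of ties):

```javascript
// Naive Kendall's tau between two rankings of the same items.
// ranksA[i] and ranksB[i] are the ranks two agents assign to world state i.
// Returns a value in [-1, 1]: 1 means identical orderings, -1 fully reversed.
function kendallTau(ranksA, ranksB) {
  const n = ranksA.length;
  let concordant = 0;
  let discordant = 0;
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      const sign = (ranksA[i] - ranksA[j]) * (ranksB[i] - ranksB[j]);
      if (sign > 0) concordant++;
      else if (sign < 0) discordant++;
    }
  }
  return (concordant - discordant) / ((n * (n - 1)) / 2);
}

// Hypothetical ranks over four world states (higher = more preferred):
// [status quo, cross-legged variant, Stephen-has-a-prion, everyone-flourishes]
const me = [3, 2, 1, 4];
const adversary = [2, 3, 4, 1]; // ranks the prion world highly

console.log(kendallTau(me, me));        // 1
console.log(kendallTau(me, adversary)); // -1: maximally unaligned on this set
```

In practice you'd only ever rank a tiny sample of world states, so this is a way of giving the idea a number, not a procedure you could actually run over all possible futures.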
