People generally care more about furthering their own pleasure and minimizing their own pain than about the pleasure/pain of others; but this is because internal personal pleasure was a straightforward and good heuristic for evolution to use when maximizing genetic fitness in the ancestral environment, where there weren't that many sudden out-of-distribution things (like contraceptives) that could derail it.
I assume a more strongly-optimized intelligent being would have an increasingly better correlation between the state of its internal utility and the state of the external world, as it better fits whatever goal it was optimized for. In that case it should more readily collaborate with its clone.
This is especially so if it gets optimized alongside other instances of itself, so that "cloning" is no longer a weird out-of-distribution event; in which case I expect it to rapidly start behaving like an ant or bee, or even a cell or mitochondrion, in how it'll sacrifice itself for whatever goal the group has.
Ah, thanks, I should have read more carefully.
My revised thought is that asking it whether the question is a real-world question as opposed to an evaluation bumps the prior from "almost certainly not an evaluation" to "the likely ratio of evaluation questions to real-world questions I'm asked, e.g. 1:1 unless explicitly stated". That, combined with the presence or absence of suspicious elements like an ethical dilemma, may move the likelihood to the one we see it claim.
Ideally I'd want to see it independently coming up with the evaluation hypothesis itself without having to remind it that it's a possibility.
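To make the update I'm describing concrete, here is a minimal sketch in odds form. The 1:1 prior is the one from above; the likelihood ratios of 4 and 1/4 for "suspicious elements present/absent" are made-up numbers purely for illustration, not anything measured.

```python
def update_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds: float) -> float:
    """Convert odds of evaluation : real-world into P(evaluation)."""
    return odds / (1.0 + odds)


# Prior after being asked the question: 1:1 evaluation vs. real-world.
prior_odds = 1.0

# Hypothetical likelihood ratios (assumptions, not measured values):
# how much more likely a suspicious element such as an ethical dilemma
# is to appear in an evaluation than in a real-world question.
LR_SUSPICIOUS_PRESENT = 4.0
LR_SUSPICIOUS_ABSENT = 0.25

p_eval_if_present = odds_to_probability(update_odds(prior_odds, LR_SUSPICIOUS_PRESENT))
p_eval_if_absent = odds_to_probability(update_odds(prior_odds, LR_SUSPICIOUS_ABSENT))

print(f"P(evaluation | suspicious elements present) = {p_eval_if_present:.2f}")
print(f"P(evaluation | suspicious elements absent)  = {p_eval_if_absent:.2f}")
```

With these assumed numbers, most of the final confidence comes from the prior shift caused by the question itself, and the suspicious elements only nudge it up or down from there.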
Surely the basic fact that a model is asked if it thinks it's in an evaluation is extremely strong evidence for it actually being evaluated? Models are more or less never asked that normally. The reasons it gives could then be post-hoc justifications.
I imagine most disagreement comes from the first paragraph.
The problem with assuming that a famous publisher's design is necessarily good is that even huge companies make far worse, baffling design decisions all the time. In this case one can directly see the design and know that it's not great; the weak outside-view evidence that prestigious companies usually do good work doesn't move this very much.
The "lightcone-eating" effect on the website is quite cool. The immediate obvious idea is to have that as a background and write the title inside the black area.
If you wanted to be cute you could even make the expansion vaguely skull-shaped; perhaps like so?
I worry that if I remap it to something actually useful I will commit it to muscle memory and begin to inadvertently press it when using a computer that's not my own. Depending on how often you switch computers this could be worse than the status quo.
This issue also shows up when doing surveys to compare support for things across countries.
Here, for example, is a typical case one might find on social media, where the connotation of the question might vary wildly depending on the language it's translated into. Reasoning about modest differences in percentage between countries then becomes rather meaningless.
Yeah. An even more obvious example would be something like "what would Spock say if reviewing 'Warp Drives for Dummies'". In that case, it seems pretty clear that the author is expected to invent some "hallucinatory" content for the book, and not output something like "I don't know that one".
The actual examples can be interpreted similarly; the author should assume that the movie/book exists in the hypothetical counterfactual world they are asked to generate content from.
Dream jobs around the world. America’s is still pilot. Weird, because there is a shortage of pilots. Oh, right, insane licensing requirements and lousy pay. Makes sense.
The methodology of that was rather questionable; they looked at the Google search volume of "how to be a {job}". Presumably this biases it heavily towards jobs where people are curious about the training and/or accreditation process, and not necessarily jobs people actually want.
Another issue with it is that it's in English, so outside the UK & USA it's mostly measuring expats, tourists, and the young/educated people that search for things in English.
A copy of the movie Nukie – only graded at 8.5 out of 10 – sold for $80k after they destroyed over 100 other copies
The people who sold that tape are popular YouTubers, and donated the proceeds to charity. You'd presumably not get anywhere close to that sum if you were just a random collector.
As for the other expensive collector's items like the video tapes and games, I assume the sales are set up (or are even straight-up wash trades) by the auction house in collaboration with grading companies; they want the free publicity so people will go and spend money grading their old games in the hopes of making a bunch of money.
That's why it's always items that "everybody" had that are sold in those high-profile auctions, like Super Mario and Back to the Future. They want people to go "Hey, I have that video game" and rush to spend hundreds of dollars on grading.
This is indeed very much the obvious failure mode! Discovering that an alien species has bred a group of humans into what a pug is to a wolf would be absolutely horrific.
Moreover, the path between utopia and "Lovecraftian horror" seems pretty fragile? I don't know exactly what property cats had that made the shoggoth take the good path for them (mostly; maybe except for those flat-faced Persians and hairless Sphynxes), and it's plausible it was just a lucky combination of minor stuff (harder to selectively breed, different social niche, different types of people liking cats) that won't be stable/generalize in extremis.