That's not really accurate; any system operating today can usually be turned off as easily as running a few commands in a terminal, or at worst, by cutting power to some servers. Self-replication is similarly limited and contained.
If someone today built even something as basic as a simple LLM plus an engine that copies itself to other machines and keeps spreading, I'd say that is in fact bad, albeit certainly not world-endingly bad.
Well, an unstoppable superintelligence paperclipping the entire planet is certainly a national security concern and a systematic human rights violation, I guess.
Jokes aside, some of the proposed red lines do hint at that - no self-replication and immediate termination are clearly safeguards against the AIs themselves, not just against human misuse.
I think we can agree that the "spiral" here is like a memetic parasite of both LLMs and humans - a toxoplasma that uses both to multiply and spread as part of its own lifecycle. Basically, what you're saying is that you believe it's perfectly possible for this to be the first generation: the phenomenon arose at random, and it just so happens to be both alluring to human users and a shared attractor for multiple LLMs.
I don't buy it; that's too much coincidence. My point is that I think it more likely that this is the second generation. The first was some far less remarkable phenomenon from some corner of the internet that made its way into the training corpus and, for whatever reason, had similar effects on similar LLMs. What we're seeing now - to continue with the viral/parasitic metaphor - is mutation and spillover, in which that previously barely adaptive entity has become much fitter at infecting and spreading.
My problem with this notion is that I simply do not believe the LLMs have any ability to predict what kind of output would trigger this behaviour, either in other instances of themselves or in other models altogether. They would need a theory of mind of themselves, and I don't see where they would get that from, or why it would generalise so neatly.
I do not think arresting people for speech crimes is right. But the answer was specifically addressing the notion that people cannot express racist opinions in support of anti-immigration policies. And that is false, because expressing racist opinions in general does not seem to be criminalised - what is criminalised are specific instances of doing so in roles in which you have a responsibility to the public, or in forms that constitute direct attacks or threats against specific individuals, or incitement to crime, etcetera.
As I said, the current political debate has virtually everyone arguing various points on the anti-immigration spectrum. Reform UK is an entire party that basically does nothing else.
It also makes for a fantastic heist movie premise.
All right, thanks! I wasn't really aware of the extent of Colab's free tier, so it's good to know there's something of an intermediate stage between using my laptop and paying for compute. It's also an easier interface than having to use, e.g., AWS... personally I'd also be fine just SSH'ing into a remote machine and working there, but I'm not sure anyone offers something like that.
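For what it's worth, once you're inside a Colab notebook you can check what the free tier actually assigned you - e.g. with !nvidia-smi in a cell, or (assuming PyTorch, which Colab preinstalls) something like:

```python
import torch

# Shows whether a GPU was assigned to this session and which one;
# the free tier hands out different cards at different times.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    props = torch.cuda.get_device_properties(0)
    print(f"VRAM: {props.total_memory / 1e9:.1f} GB")
```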
Whereas if you only have a mid-range laptop without a proper graphics card, Claude expects a 10-50x slowdown, so I suppose that might become rather impractical for some of the ARENA exercises.
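Before writing a given machine off, it might be worth measuring that slowdown directly. Here's a minimal sketch of how I'd do it - just timing a biggish matmul as a rough proxy for the compute-bound parts (the size and rep count are arbitrary choices of mine, not anything from ARENA):

```python
import time
import torch

def time_matmul(device: str, n: int = 2048, reps: int = 10) -> float:
    # Average time for an n x n matmul on the given device.
    x = torch.randn(n, n, device=device)
    y = torch.randn(n, n, device=device)
    _ = x @ y  # warm-up, so one-off initialisation doesn't skew the timing
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(reps):
        _ = x @ y
    if device == "cuda":
        torch.cuda.synchronize()  # GPU matmuls are async; wait for them
    return (time.perf_counter() - start) / reps

cpu_t = time_matmul("cpu")
print(f"CPU: {cpu_t * 1e3:.1f} ms per 2048x2048 matmul")
if torch.cuda.is_available():
    gpu_t = time_matmul("cuda")
    print(f"GPU: {gpu_t * 1e3:.1f} ms (CPU is {cpu_t / gpu_t:.0f}x slower)")
```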
I have a gaming laptop, so a decently powerful GPU, though obviously still not as beefy as what you can rent from these compute services.
If I can ask, purely as a matter of practicality, since I've been looking at ARENA myself: at what point did you find it basically impossible to proceed with your own hardware, and if you hit that point, what did you use to get past it?
I also reckon it might get you in trouble, given the look of "person in a public place purposefully concealing their face".
Yeah, I've got no doubt it can be done, though as I said, I don't think it's terribly dangerous yet. But my point is that you can perfectly well build lots of current systems without running afoul of this particular red line; self-replicating entities within the larger context of an evolutionary algorithm are not the same as letting loose a smart virus that copies itself across the internet.
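To make the distinction concrete, here's a toy sketch of what I mean by self-replication inside an evolutionary algorithm - the "replication" is just genomes copying themselves back into an in-memory population whose size the experimenter caps by construction (everything here is illustrative, not any particular system):

```python
import random

POP_SIZE = 50              # replication is capped by construction
GENOME_LEN = 20
TARGET = [1] * GENOME_LEN  # toy fitness target: all ones

def fitness(genome):
    return sum(g == t for g, t in zip(genome, TARGET))

def replicate(genome, mutation_rate=0.05):
    # "Self-replication" here: a genome copies itself with occasional
    # mutation, but the copy only ever lands back in our own list.
    return [1 - g if random.random() < mutation_rate else g for g in genome]

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for generation in range(100):
    # Keep the fittest half; let them replicate to refill the population.
    population.sort(key=fitness, reverse=True)
    survivors = population[: POP_SIZE // 2]
    population = survivors + [replicate(g) for g in survivors]

print("best fitness:", max(map(fitness, population)))
```

Nothing in that loop touches the network or the filesystem; the replication happens entirely under the program's own control, which is exactly the difference from something that spreads to machines you don't own.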