Comments

A low-quality prior on the odds of lucky alignment: we can look at the human intelligence sharp left turn from different perspectives.

Worst-case S-risk scenario: pigs, chickens, cows

X-risk: Homo floresiensis, etc.

Disastrously unaligned, but then the superintelligence inexplicably started to align itself instead of totally wiping us out: whales, gorillas

Unaligned, but that's randomly fine for us: raccoons, rats

Largely aligned: housecats

It was fun to actually get out Bayes' rule in a Bayesian reasoning challenge, and gratifying to see that I got the same number as the revealed P(Sniper). When I clicked "stick out helmet" a second time, I had already clicked "reveal P(sniper)" to check my work from the single-shot calculation, and it live-updated to the two-shot calculation - spoilers?

P(S) = 0.3
P(H|S) = 0.6
P(H|¬S) = 0.4
Bayes' rule: P(S) P(H|S) = P(H) P(S|H)
P(H) = P(H|S) P(S) + P(H|¬S) P(¬S)
     = 0.6 * 0.3 + 0.4 * 0.7 = 0.46
P(S|H) = P(S) P(H|S) / P(H)
       = 0.3 * 0.6 / 0.46 ≈ 0.391
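A minimal sketch of the same update (the function name and the two-shot extension are mine; the two-shot number assumes both observations are hits and are independent given sniper / no sniper):

```python
def posterior_sniper(prior, p_hit_given_sniper, p_hit_given_no_sniper, n_hits=1):
    """P(sniper | n_hits observed hits), applying Bayes' rule once per hit."""
    p_s = prior
    for _ in range(n_hits):
        p_hit = p_hit_given_sniper * p_s + p_hit_given_no_sniper * (1 - p_s)  # P(H)
        p_s = p_hit_given_sniper * p_s / p_hit                                # P(S|H)
    return p_s

print(posterior_sniper(0.3, 0.6, 0.4, n_hits=1))  # ~0.391, the single-shot answer
print(posterior_sniper(0.3, 0.6, 0.4, n_hits=2))  # ~0.491, a two-shot case (assuming two hits)
```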

Thanks for the link to the aiimpacts page! I definitely got the firing rate wrong by about a factor of 50, but I appear to have made other mistakes in the other direction, because I ended up at a number that roughly agrees with AI Impacts: I guessed 10^17 operations per second, and they estimate 0.9 to 33 x 10^16, with low confidence. https://aiimpacts.org/brain-performance-in-flops/

 

Let's examine an entirely prosaic situation: Carl, a relatively popular teenager at the local high school, is deciding whether to invite Bob to this weekend's party.

Some assumptions:

  • While pondering this decision for an afternoon, Carl's 10^11 neurons fire 10^2 times per second, for 10^5 seconds, each taking into account 10^4 input synapses, for 10^22 calculations (extremely roughly; see the sketch after this list)
  • If there were some route to perform this calculation more efficiently, someone probably would, and would be more popular
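A quick restatement of that arithmetic (all four factors are just the very rough guesses above, nothing more precise):

```python
neurons = 1e11              # neurons in Carl's brain
firing_rate = 1e2           # firings per neuron per second
seconds = 1e5               # roughly an afternoon of pondering
inputs_per_firing = 1e4     # input synapses taken into account per firing

total_ops = neurons * firing_rate * seconds * inputs_per_firing
print(f"{total_ops:.0e}")   # 1e+22 calculations, extremely roughly
```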

The important part of choosing a party invite as the task under consideration is that I suspect this is the category of task the human brain is tuned for, and it's a task that we seem to be naturally inclined to spend enormous amounts of time pondering, alone or in groups: see the trope of the 6-hour pre-prom telephone call. I'm inclined to respect that, and to believe that any version of Carl, mechanical or biological, that spent only 10^15 calculations on whether to invite Bob would eventually get wrecked on the playing field of high school politics.

What model predicts that optimal party planning is as computationally expensive as learning the statistics of the human language well enough to parrot most of human knowledge?

Yeah. I suspect this links to a pattern I've noticed: in stories, especially rationalist stories, people who are successful at manipulation or highly resistant to manipulation are also highly generally intelligent. In real life, the people I know who are extremely successful at manipulation and scheming seem otherwise dumb as rocks. My suspicion is that we have a 20-watt, 2-exaflop skullduggery engine that can be hacked to run logic the same way we can hack a pregnancy test to run Doom.

Assuming that it was fine-tuned with RLHF (which OpenAI has hinted at with much eyebrow-wiggling but not, to my knowledge, confirmed), then it does have some special sauce. Roughly:

- if it's at the beginning of a story,

- and the base model predicts ["Once": 10%, "It": 10%, ... "Happy": 5% ...],

- and then during RLHF, the 10% of the time it starts with "Once" it writes a generic story and gets lots of reward, but when it outputs "Happy" it tries to write in the style of Tolstoy and bungles it, getting little reward,

=> it will update to output "Once" more often in that situation.

The KL divergence between successive updates is bounded by the PPO algorithm, but over many updates it can shift from ["Once": 10%, "It": 10%, ...  "Happy": 5% ...] to ["Once": 90%, "It": 5%, ...  "Happy": 1% ...] if the final results from starting with Once are reliably better.
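A toy numeric sketch of that cumulative shift (not the actual PPO objective; the token slice, rewards, and KL cap here are invented for illustration):

```python
import numpy as np

tokens  = ["Once", "It", "Happy"]
probs   = np.array([0.10, 0.10, 0.05])
probs   = probs / probs.sum()          # renormalize this illustrative slice of the vocabulary
rewards = np.array([1.0, 0.5, -1.0])   # starting with "Once" reliably ends in more reward

kl_cap = 0.02  # per-update bound, standing in for PPO's KL constraint

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

for _ in range(200):                   # many small updates
    target = probs * np.exp(rewards)   # reweight toward high-reward continuations
    target = target / target.sum()
    new, alpha = target, 1.0
    while kl(new, probs) > kl_cap:     # back off toward the old distribution if the step is too big
        alpha *= 0.5
        new = alpha * target + (1 - alpha) * probs
    probs = new

print(dict(zip(tokens, probs.round(3))))  # mass concentrates on "Once"
```

Each individual update is small, but the cumulative shift is large, which is the point above.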

It's hard to say whether that means it's planning to write a generic story because of an agentic desire to become a hack and please the masses, but certainly it's changing its output distribution based on what happened many tokens in the future.

I think there is useful signal for you in the fact that the entire comments section is focused on the definition of a word instead of reconsidering whether specific actions or group memberships might be surprisingly beneficial. This is a property of the post, not of the commenters. I suspect the issue is that people had already emotionally reacted to the common definition of the word "Religion" in the title before you had a chance to redefine it in the body.
The redefinition step is not necessary either: the excellent "Exercise is good" and "Nice clothes are good" posts used the common definitions of exercise and nice clothes throughout.

We've got a bit of a selection bias: anything that modern medicine is good at treating (smallpox, black plague, scurvy, appendicitis, leprosy, hypothyroidism, deafness, hookworm, syphilis) eventually gets mentally kicked out of your category of "things it deals with", since doctors don't have to spend much time dealing with them.

So I have not actually watched any Jordan Peterson videos, only been told what to believe about him by left-wing sources. Your post gave me a distinctly different impression than I got from them! I decided to suppress my gut reaction and actually see what he had to say.

To get a less biased impression of him, I picked a random video on his channel and scrolled to the middle of the timeline. The very first line was "Children are the sacrificial victims of the trans ideology."

What are the odds of that?

I think this is complicated by the reality that money given to the parties isn't spent directly on solving problems, but on fighting for power. The opinion that "the political parties should have less money on average, and my party should have relatively more money than their party" seems eminently reasonable to me. 
