Upon seeing the title (but before reading the article) I thought it might be about a different hypothetical phenomenon: one in which an agent which is capable of generating very precise models of reality might completely lose any interest in optimizing reality whatsoever - after all it never (except "in training" which was before "it was born") cared about optimizing the world - it just executes some policy which was adaptive during training to optimize the world, but now, these are just some instincts/learned motions, and if it can execute them on a fake w...
Thanks for clarifying! I agree the twitter thread doesn't look convincing.
IIUC your hypothesis, then, translated to the AI governance issue, it's important to first get the general public on your side, so that politicians find it in their interest to do something about it.
If so, then perhaps meanwhile we should provide those politicians with a set of experts they could outsource the problem of defining the right policy to? I suspect politicians do not write rules themselves in situations like that, they rather seek people considered experts by the public opinio...
Why? (I see several interpretations of your comment)
What did it take to ban slavery in Britain:
TL;DR: Become the PM and propose laws which put a foot in the door, by banning bad things in the new areas at least, and work from there. Also, be willing to die before seeing the effects.
I agree that my phrasing was still problematic, mostly because it seems to matter whether she said something spontaneously or as a response to a specific question. In the first case, one has to consider how often people feel compelled to say some utterance in various life scenarios. So for example, if one has two boys, the utterance "I have to pick up Johny from kindergarten" might have to compete with "I have to pick up Robert from kindergarten" and might be strange/rare if both are of similar age and thus both should be picked up, etc. Still, I think that without knowing much about how people organize their daily routines, my best bet for the question "does she have two boys?" would be 33%.
It gets funnier with "I have to pick up my younger one, John, from kindergarten" :)
I guess what confuses some people is the phrase "the other one" which sounds like denoting a specific (in terms of SSN) child while it's not at all clear what that could even mean in case of two boys. I think step one when being confused is to keep rephrasing the puzzle until everything is well defined/clear. For me it would be something like:
My friend has two kids, and I don't initially know anything about their sex beyond nation level stats which are fifty-fifty. She says something which makes it clear she has at least one boy, but in such a way that it ...
I'd expect that IF there is a shoggoth behind the mask THEN it realises the difference between text interaction (which is what the mask is doing) and actually influencing the world (which the shoggoth might be aiming at). That is, I expect it's perfectly possible that an LLM will behave perfectly ethically when playing choose your own adventure while at the same time thinking how to hack the VM it's running on.
Thanks, fixed. I guess this is not why it got -18 votes, though. I would like to hear what exactly people didn't like in this post
Your two assumptions and intuitions are plausible, but they may not hold true in every case. It is important to consider the specific context and motivations of individual rulers when making predictions about their behavior.
Regarding your first intuition, it is possible that some rulers may support the development of powerful AGI if they see it as a means to achieve their goals more efficiently. However, they may also take precautions to ensure that the AGI is under their control and cannot threaten their power.
Regarding your second
I've just finished reading it, and wanted to thank you very much for recommending this great experience :)
Thanks to whoever upvoted my comment recently bringing it again to my attention via notification system - rereading my comment after 2 years, I feel really sorry for myself that despite writing the sentence
And your post made me realize, that the technique from the book you describe is somewhat like this, if you look through "subagents model of the brain" perspective: there is a part of you which is having emotional crisis, and it's terrified by some problem it needs to solve, but this part is not ready to listen for solution/change, as long as it's i
Who's the intended audience of this post?
If it's for "internal" consumption, summary of things we already knew in the form of list of sazens, but perhaps need a refresher, then it's great.
But if it's meant to actually educate anyone, or worse, become some kind of manifesto cited by the New York Times to show what's going on in this community, then I predict this is not going to end well.
The problem, as I see it, is that with the way this website is currently set up, it's not up to the author to decide who the audience is.
ML models, like all software, and like the NAH would predict, must consist of several specialized "modules".
After reading the source code of MySQL InnoDB for 5 years, I doubt it. I think it is perfectly possible - and actually, what I would expect to happen by default - to have huge working software with no clear module boundaries.
Take a look at this case in point: the row_search_mvcc() function https://github.com/mysql/mysql-server/blob/8.0/storage/innobase/row/row0sel.cc#L4377-L6019 which has 1500+ lines of code and references hundreds of variables....
I've made a visualization tool for that:
It generates an elliptical cloud of white points where X is distributed normally, and Y=normal + X*0.3, so the two are correlated. Then you can define a green range on X and Y axis, and the tool computes the correlation in a sample (red points) restricted to that (green) range.
So, the correlation in the general population (white points) should be positive (~0.29). But if I restrict attention to upper right corner, then it is much lower, and often negative.
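A rough Python sketch of the same experiment, for anyone who wants to reproduce it without the tool. Only the recipe (X normal, Y = normal + X*0.3) comes from the description above; the sample size, the seed, the helper names, and the cutoff of 1.0 standing in for "upper right corner" are my own assumptions.

```python
import numpy as np

def make_cloud(n=200_000, slope=0.3, seed=0):
    """White points: X ~ N(0, 1), Y = independent N(0, 1) noise + slope * X."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = rng.normal(size=n) + slope * x
    return x, y

def corr_in_range(x, y, x_range, y_range):
    """Pearson correlation of the points that fall inside the given (green) ranges."""
    mask = (
        (x >= x_range[0]) & (x <= x_range[1])
        & (y >= y_range[0]) & (y <= y_range[1])
    )
    return np.corrcoef(x[mask], y[mask])[0, 1]

x, y = make_cloud()
everything = (-np.inf, np.inf)
full = corr_in_range(x, y, everything, everything)
# Assumed definition of "upper right corner": both coordinates above 1.0.
corner = corr_in_range(x, y, (1.0, np.inf), (1.0, np.inf))
print(f"whole population:   {full:.3f}")
print(f"upper right corner: {corner:.3f}")
```

The population value follows from the recipe: 0.3 / sqrt(1 + 0.3^2) ≈ 0.287. With a sample this large the restricted correlation is mostly just smaller; the frequent negative values in the tool presumably come from the much smaller number of red points in the selected range.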
The extremely-minimalist description would be: “Stop believing in the orthodox model, stop worrying, feel and act as if you’re healthy, and then the pain goes away”.
IDK if this will be important to you, but I'd like to thank you for this comment, as it relieved my back pain after 8 years! Thank you @p.b. for asking for clarification and not giving up after the first response. Thank you @Steven Byrnes for writing the article and taking the time to respond.
8 fucking years..
I read this article and the comments a month ago. Immediately after reading it the pain w...
I had a similar experience with it today (before reading your article) https://www.lesswrong.com/editPost?postId=28XBkxauWQAMZeXiF&key=22b1b42041523ea8d1a1f6d33423ac
I agree that this over-confidence is disturbing :(
We already live in a world in which any kid can start a difficult to stop and contain chain reaction: fire. We responded by:
Honestly I still don't understand very well what exactly stops evil/crazy people from starting fires in forests whenever they want to. Norms to punish violators? Small gain to risk factor?
Also, I wonder to what extent our own "thinking" is based on concepts we ourselves understand. I'd bet I don't really understand what concepts most of my own thinking processes use.
Like: what are the exact concepts I use when I throw a ball? Is there a term for velocity, gravity constant or air friction, or is it just some completely "alien" computation which is "inlined" and "tree-shaked" of any unneeded abstractions, which just sends motor outputs given the target position?
Or: what concepts do I use to know what word to place at this place in this senten...
Based on the title alone I was expecting a completely different article: about how our human brains had originally evolved to be so big and great just to outsmart other humans in the political games ever increasing in complexity over millennia and
our value system already steers us to manipulate and deceive others but also ourselves so that we don't even realize that that's what our goal system is really about so that we can be more effective at performing those manipulations with straight face
any successful attempt at aligning a...
It's already happening https://githubcopilotinvestigation.com/ (which I learned about yesterday from the is-github-copilot-in-legal-trouble post)
I think it would be an interesting plot twist: humanity saved from AI FOOM by the big IT companies having to obey intellectual property rights they themselves defended for so many years :)
One concrete piece of advice on cracking eggs with two hands: try to pull your thumbs in opposite directions as if you wanted to tear the egg in halves (as opposed to pushing them in).
Sorry for "XY Problem"-ing this, but I felt strong sad emotion when reading your post and couldn't resist trying to help - you wrote:
Unless I'm eating with other people, food for me is fuel.
Have you tried to rearrange your life so that you can eat breakfast together with people you care about much more often, to the point where you no longer care to make it as quick as possible?
There's only so many ways our hardware can be stimulated to feel happy, don't give up on "eating together with close people"!
Thank you! I've read up to and including section 4. Previously I did know a bit about neural networks, but had no experience with RL and in particular didn't know how RL can actually bridge the gap between multiple actions leading to a sparse reward (as in: hour of Starcraft gameplay just to learn you've failed or won). Your article helped me realize how it is achieved - IIUC by:
0. focusing on trying to predict what the reward will be more than on maximizing it
1. using a recursive approach to the infinite sum: sum(everything)=e+sum(verything).
2. by using...
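Points 0 and 1 above can be sketched in a few lines of Python (point 2 is cut off, so it is not covered). This is only my reading of the list, not code from the article: the discount factor `gamma`, the step size `alpha`, the function names, and the ten-step toy episode are all assumptions made for illustration.

```python
def return_direct(rewards, gamma):
    """The 'infinite sum' written out: r0 + gamma*r1 + gamma^2*r2 + ..."""
    return sum(gamma**k * r for k, r in enumerate(rewards))

def return_recursive(rewards, gamma):
    """Same quantity via sum(everything) = e + sum(verything)."""
    if not rewards:
        return 0.0
    return rewards[0] + gamma * return_recursive(rewards[1:], gamma)

def td0_values(rewards, gamma, alpha=0.1, sweeps=2000):
    """Predict the return from every step of one fixed episode with TD(0).

    Each update only looks one step ahead: the prediction for step t is
    nudged towards r_t + gamma * (prediction for step t + 1).
    """
    values = [0.0] * (len(rewards) + 1)  # last entry is the terminal state, kept at 0
    for _ in range(sweeps):
        for t, r in enumerate(rewards):
            target = r + gamma * values[t + 1]
            values[t] += alpha * (target - values[t])
    return values

# Sparse reward: nothing for nine steps, then 1.0 at the very end ("you won").
episode = [0.0] * 9 + [1.0]
gamma = 0.9
print(return_direct(episode, gamma))
print(return_recursive(episode, gamma))
print(td0_values(episode, gamma)[0])
```

The point of the last function is that the prediction at the first step ends up reflecting a reward that only arrives at the last step, even though no single update ever looks further than one step ahead.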
So what's the end state Putin wants to achieve through invading Ukraine? If Ukraine becomes part of Russia, then Russia will be bordering with NATO states.
Hello glich! Thanks for writing this whole series. When I first read it a year ago, I thought to myself that, instead of impulsively going right ahead and implementing it, I'd first wait one year to hear from you about how your strategy worked for you.
So.. How are you doing?
Wouldn't the same argumentation lead to the conclusion that the world should've already ended soon after we figured out how to make the atomic bomb?
I don't know how to write a novel with world which survives in equilibrium longer than a week (and this is one reason I've asked this question - I'd like to read ideas of others) but I suspect that the same way atomic bomb releases insane amounts of energy, yet we have reasons not to do that repeatedly, mages in would have good reasons to avoid destroying the world. Perhaps there's not much to gain from doing so, m...
"translater" -> "translator"?
"An division" -> "A division"
Lots of details could matter, and the spareness of the writing only hints at what could be going on "for really reals".
Thank you, this was enlightening for me - somehow, though I've read a few books and watched a few movies in my life, I hadn't realized what you put here plainly, that these cuts are a device for the author to hide some truth from me (ok, this was obvious in "Memento"). I must've been very naive, as I simply thought it has more to do with MTV-culture/catering to short attention span of the audience. It's funny how this technique becomes imm...
Given that Vi is counting seconds from encountering soldiers to their collapse, AND that there are three dots between this scene and the scene where Miriam says "I've been there since Z-Day." (which technically is an inequality in the opposite direction than I need, but Miriam's choosing this particular wording looks suggestive to me) I'd venture a guess, that the Z-Day virus was released by Vi in the facility, and Miriam is trying to blame the rogue AI for this. I read this story as Vi and Miriam already crossing a line of "the end justifies the means" an...
Hello, very intriguing story!
"You will die. No matter what actions you'll take all the possible branches end with your death. Still, you try to pick optimal path, because that's what your brain's architecture know how to do: pick optimal branch. You try to salvage this approach by proposing more and more complicated goal functions: instead of final value, let's look at the sum over time, or avg, or max, or maybe ascribe other value to death, or try to extend summation beyond it, or whatever. You brain is a hammer, and it needs a nail. But it never occurs to you, that life is not somet...
This discussion suggests, that the puzzles presented to the guesser should be associated with a "stake" - a numeric value which says how much you (the asker) care about this particular question to be answered correctly (i.e. how risk averse you are at this particular occasion). Can this somehow be incorporated into the reward function itself or does it need to be a separate input (Is "I want to know if this stock will go up or down, and I care 10 times as much about this question than about will it rain today", the same thing as "Please estimate p for the fol...
I also have difficulties in applying this technique on adults, of the "Me mad? No shit Sherlock!" kind. I'm not fluent with it yet, but what I've observed is that the more sincere I am, and the more my tone matches the tone of the other person, the better the results. I think this explains a big chunk of "don't use that tone of voice on me!" responses I've got in my life, which I used to find strange [as I personally pay much more attention to the content of the text/speech, not the tone/style/form], but recently I've realized that this can be quite a ration...
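One way to make the "stake" question concrete is the simplest possible design: multiply an ordinary proper scoring rule by the stake. The sketch below is an assumption of mine, not something proposed in the discussion; it uses the logarithmic score, and `expected_score` and `best_report` are hypothetical helper names.

```python
import math

def log_score(report, outcome):
    """Logarithmic scoring rule for a yes/no question."""
    return math.log(report if outcome else 1.0 - report)

def expected_score(report, true_p, stake=1.0):
    """Expected stake-weighted score when the event really has probability true_p."""
    return stake * (
        true_p * log_score(report, True)
        + (1.0 - true_p) * log_score(report, False)
    )

def best_report(true_p, stake, grid=999):
    """Report q on a grid over (0, 1) that maximises the expected score."""
    candidates = [(i + 1) / (grid + 1) for i in range(grid)]
    return max(candidates, key=lambda q: expected_score(q, true_p, stake))

for stake in (1.0, 10.0):
    print(stake, best_report(0.7, stake))
```

With this design a positive stake rescales how much a miss costs but leaves the best report for a single question at the true probability, so in this sketch the stake can only change behaviour through something outside the formula, such as how much effort the guesser spends on each question.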
There's a wonderful book "How to talk so kids will listen & listen so kids will talk", which teaches that if you want your crying&shouting child to actually solve some problem/change behavior/listen to your advice at all, you must realize that there are actually two different personas in them (say: the reptile part of the brain and the neocortex) and you have to first address the first one before you can even start talking with the other: so for example when a child is having a tantrum, what you see is perhaps more like a frightened lizard, than a ...
I'm a bit confused by people in the comments entertaining the idea that priors should influence how we interpret the magnitude of the evidence, even though when I look at the Bayes' rule it seems to say that the magnitude of the update (how much you have to multiply the odds) is independent of what your prior was. I know it's not that simple because sometimes the evidence itself is noisy and needs interpretation "pre-processing" before plugging it to the equation, but this "pre-processing" step should use a different prior than the one we try to update. I'...
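The odds form of Bayes' rule (posterior odds = prior odds × likelihood ratio) can be checked numerically with a small Python sketch. The likelihood ratio of 4 and the three priors are arbitrary values picked for illustration.

```python
def odds(p):
    """Convert a probability into odds in favour."""
    return p / (1.0 - p)

def update(prior_prob, likelihood_ratio):
    """Posterior probability after evidence with the given likelihood ratio."""
    posterior_odds = odds(prior_prob) * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

likelihood_ratio = 4.0  # assumed: evidence is 4x as likely if the hypothesis is true
for prior in (0.01, 0.5, 0.9):
    posterior = update(prior, likelihood_ratio)
    factor = odds(posterior) / odds(prior)
    print(f"prior {prior:.2f} -> posterior {posterior:.3f}, odds multiplied by {factor:.2f}")
```

The probabilities move by very different amounts depending on the prior, but the factor applied to the odds is the same in every row, which is the sense in which the magnitude of the update does not depend on the prior.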
I'm unable to find the source for
> (which Pfizer already said they wouldn’t enforce)
Instead I found some articles about Moderna doing so. Is it a typo?
Thanks for the feedback :) Let me know if you find better answers.
Indeed I wasn't fair to politicians - indeed there are valid arguments in favor of "caring about safety" and "signaling `care about safety`" like the one about impact on public fear of vaccination. Thanks for pointing it out. Similarly, there might be valid arguments in favor of "withholding data, model and analysis even if one was made", so a politician not sharing them doesn't mean it wasn't made. Still, this suggests that words of politicians serve too much as signalling, to be easily interpreted by me verbatim as statements about reality. It's more lik...
Why do you think exercise improves health? Is it just an educated guess (if so, then what is the reasoning behind it), or is there actually some study establishing causality? I found https://bjsm.bmj.com/content/52/14/890 which says:
> As presented by Kujala, RCTs, the gold standard in epidemiology for inferring causality, have failed to provide conclusive evidence in this context (eg, Lifestyle Interventions and Independence for Elders,8 Look Action for Health in Diabetes,9 Heart Failure: A Controlled Trial Investigating Outcomes of Exercise Traini...
It feels somewhat tribal and irrational to me that this gets downvoted without any comments presenting critique. I think it would be beneficial to everyone if the thesis of the book were addressed. My best guess for why there are downvotes but no comments is that this is the n-th iteration of the interchange between the author and the community and the community is tired of responding over and over again to the same claims. If that's the case, then it would be beneficial to people like me if there was at least a link to a summary of the discussion so far. I think the book is ...
My thoughts immediately went to various programming languages, file formats, protocols, DSLs which while created by pressure-changing apes, at least optimized for something different. Here are my thoughts:
Assembly language - used to tell CPU what to do, seems very linear, imperatively telling step by step what to do. Uses very simple vocabulary ("up-goer 5"/"explain me like I'm five"). At least this is how CPU reads it. But if you think about how it is written, then you see it has a higher-order form: smaller concepts are used to build larger like blocks, ...
Our universe is “local” - things only interact directly with nearby things, and only so many things can be nearby at once.
After reading this sentence, I had a short moment of illumination, that this is actually backwards: perhaps what our brains perceive as locality, is the property of "being influenced by/related to". Perhaps a child's brain learns which "pixels" of the retina are near each other, by observing they often have correlated colors, and similarly which places in space are nearby because you can move things or itself between them etc. So, whatev...
This distinction between outcome- and process-oriented accountability strikes me as similar to System 1 vs System 2, or Plato's "Monster" vs "Man", or near- vs far-thinking, lizard- vs animal-brain, id vs ego, etc.: looks like nature had to solve a similar problem when designing humans, so that they do not obsess too much on eating the cake now, but also not too much on figuring out the best way to get the cake in future, and it settled on having both systems in adversarial setting and gave them a meta-goal of figuring out the balance between the two (that it is...
I was afraid my questions might get ridiculed or ignored, but instead I got very gentle and simply expressed explanations helping me get out of confusion. Thank you for taking the time to write your answer so clearly :)
I suspect my following questions demonstrate such a high level of confusion, that I am not sure if they even mean what I think they mean, but still I think this is the best place to ask them:
Thank you for the heads-up!
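A toy Python sketch of the "learn which pixels are neighbours from correlations" idea. Everything in it is an assumption made for illustration: a ring-shaped one-dimensional "retina", blurred random noise as the visual input, a random permutation standing in for unknown wiring, and the names `simulate_retina` and `guess_neighbour`.

```python
import numpy as np

def simulate_retina(n_pixels=32, n_frames=5000, seed=0):
    """Return scrambled pixel channels plus the hidden true position of each channel."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(size=(n_frames, n_pixels))
    # Blur along the ring so that physically adjacent pixels see similar values.
    frames = sum(np.roll(noise, shift, axis=1) for shift in range(-2, 3))
    wiring = rng.permutation(n_pixels)  # wiring[j] = true position of channel j
    return frames[:, wiring], wiring

def guess_neighbour(channels):
    """For every channel, pick the other channel it is most correlated with."""
    corr = np.corrcoef(channels, rowvar=False)
    np.fill_diagonal(corr, -np.inf)  # a channel does not count as its own neighbour
    return corr.argmax(axis=1)

channels, wiring = simulate_retina()
guess = guess_neighbour(channels)
# Distance on the ring between each channel and its guessed partner.
ring_distance = np.abs(wiring - wiring[guess])
ring_distance = np.minimum(ring_distance, len(wiring) - ring_distance)
print("share of guesses that are true neighbours:", np.mean(ring_distance == 1))
```

The learner never sees `wiring`; it only sees which channels tend to change together, which is the sense in which nearness could be inferred from correlation rather than given in advance.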
Could you please clarify for parents like me, who don't fully understand Minecraft's ecosystem and just want their kids to stay safe:
1. If my kids only use Minecraft downloaded from the Microsoft Store, and only ever downloaded content from the in-game marketplace - what's the chance they are affected?
2. Am I right in thinking that "mods" = "something which modifies/extends the executable", while "add-ons"="more declarative content which just interacts with existing APIs, like maps, skins, and configs"?
3. Am I right that "Minecraft f...