All Posts

Sorted by New

Friday, June 16th 2023

Shortform
johnswentworth · 9h · 8 points
Consider two claims:

* Any system can be modeled as maximizing some utility function, therefore utility maximization is not a very useful model.
* Corrigibility is possible, but utility maximization is incompatible with corrigibility, therefore we need some non-utility-maximizer kind of agent to achieve corrigibility.

These two claims should probably not both be true! If any system can be modeled as maximizing a utility function, and it is possible to build a corrigible system, then naively the corrigible system can be modeled as maximizing a utility function. I expect that many people's intuitive mental models around utility maximization boil down to "boo utility maximizer models", and they would therefore intuitively expect both the above claims to be true at first glance. But on examination, the probable incompatibility is fairly obvious, so the two claims might make a useful test to notice when one is relying on yay/boo reasoning about utilities in an incoherent way.
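The "anything can be modeled as maximizing something" move in the first claim can be made concrete with a small sketch. The policy, state keys, and action names below are invented purely for illustration: given any policy (here, a stylized "corrigible" one), an indicator utility that scores 1 exactly when the agent does what the policy says makes the argmax agent reproduce that policy.

```python
# Trivializing construction: any policy is the argmax of *some* utility.
def corrigible_policy(state):
    # Hypothetical agent that always defers to a shutdown request.
    if state.get("shutdown_requested"):
        return "shut_down"
    return "pursue_task"

def make_utility(policy):
    # Indicator utility: maximal iff the action matches the policy.
    return lambda state, action: 1.0 if action == policy(state) else 0.0

u = make_utility(corrigible_policy)
actions = ["pursue_task", "shut_down", "resist_shutdown"]

for state in [{"shutdown_requested": False}, {"shutdown_requested": True}]:
    best = max(actions, key=lambda a: u(state, a))
    assert best == corrigible_policy(state)  # the maximizer is "corrigible"
```

This is of course exactly why the claim, taken in this unrestricted sense, carries so little information.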
kuira · 9h · 3 points
Sometimes I have an internal desire to do something different from what I think should be done (for example, I might desire to play a game while also thinking the better choice is to read). I've been experimenting with using randomness to mediate this. I keep a D20 with me, give each side of the dispute odds proportional to the strength of its resolve, and then roll the die. In theory, this means neither side will overpower the other, and even a small resolve still has a chance. I'm not sure how useful this is, but it's fun, and can sort of give me motivation (I've tried to internalize this kind of roll as a rule not to break without good reason). Also, when I'm merely deciding between some options, sometimes I'll roll more casually with equal odds, and it'll help me realize that I already know which option I really wanted (if I don't like the roll's outcome).
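A hypothetical software version of this D20 procedure, for anyone without a die handy. The option names and weights are made up for illustration; each side gets a number of the 20 faces proportional to the strength of its resolve.

```python
import random

def mediate(options, seed=None):
    # options: list of (name, faces); faces should cover all 20 sides of a D20
    total = sum(faces for _, faces in options)
    assert total == 20, "weights should cover all 20 faces"
    roll = random.Random(seed).randint(1, total)  # inclusive bounds
    cumulative = 0
    for name, faces in options:
        cumulative += faces
        if roll <= cumulative:
            return name, roll

# "read" wins on a roll of 1-14, "play a game" on 15-20
choice, roll = mediate([("read", 14), ("play a game", 6)])
```

As in the original practice, the point is to commit to `choice` once the roll is made, absent a good reason not to.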
NicholasKross · 7h · 2 points
In response to / inspired by this SSC post [https://astralcodexten.substack.com/p/your-incentives-are-not-the-same]: I was originally going to comment something about "how do I balance this with the need to filter for niche nerds who are like me?", but then I remembered that the post is actually literally about dunks/insults on Twitter. o_0 This, in meta- and object-level ways, got to a core problem I have: I want to do smart and nice things with smart and nice people, yet these (especially the social stuff) require me to be so careful + actually have anything like a self-filter. And even trying to practice/exercise that basic self-filtering skill feels physically draining. (ADHD + poor sleep btw, but just pointing these out doesn't do much!)

To expand on this (my initial comment [https://astralcodexten.substack.com/p/your-incentives-are-not-the-same/comment/17376134]): While I love being chill and being around chill people, I also (depending on my emotional state) can find it exhausting to do basic social things like "not saying every thought that you think" and "not framing every sentence I say as a joke". I was once given the "personal social boundaries" talk by some family members. One of them said they were uncomfortable with a certain behavior/conversational-thing I did. (It was probably something between "fully conscious" and "a diagnosable tic".) And I told them flat-out that I would have trouble staying in their boundary (which was extremely basic and reasonable of them to set, mind you!), and that I literally preferred not-interacting-with-them to spending the energy to mask.

Posts like this remind me of how scared of myself I sometimes am, and maybe should be? I'm scared of being either [ostracized by communities I deeply love] or [exhausting myself by "masking" all the time]. And I don't really know how to escape this, except by learned coping mechanisms that are either (to me) "slowly revealing more of myself and being more casual, in proport
Douglas_Knight · 13h · 2 points
Someone just told me that the solution to conflicting experiments is more experiments. Taken literally this is wrong: more experiments just means more conflict. What we need are fewer experiments. We need to get rid of the bad experiments. Why expect that future experiments will be better? Maybe if the experimenters read the past experiments, they could learn from them. Well, maybe, but maybe if you read the experiments today, you could figure out which ones are bad today. If you don't read the experiments today and don't bother to judge which ones are better, what incentive is there for future experimenters to make better experiments, rather than accumulating conflict?
Dalcy Bremin · 1h · 1 point
What's a good technical introduction to Decision Theory and Game Theory for alignment researchers? I'm guessing standard undergrad textbooks don't include, say, content about logical decision theory. I've mostly been reading posts on LW but as with most stuff here they feel more like self-contained blog posts (rather than textbooks that build on top of a common context) so I was wondering if there was anything like a canonical resource providing a unified technical / math-y perspective on the whole subject.

Thursday, June 15th 2023

Shortform
Anton Zheltoukhov · 1d · 1 point
A THOUSAND NARRATIVES. THEORY OF MEMETIC EVOLUTION. PART 1/20. INTRO

The ultimate goal of this line of research is to gain a better understanding of how the human value system operates. The problem I see regarding current approaches to studying values is that we cannot study {values/desires/preferences} in isolation from the rest of our cognitive mechanisms, because according to the latest theories values are just a part of a broader system governing behaviour in general. So you have to have a decent model of human behaviour first to then be able to explain value dynamics. To get a good theory of the mind you have to meet multiple requirements:

1. A good theory of the mind must span at least four different timescales: (genetic evolution) the billion years over which our brains have evolved; (memetic evolution) the centuries of cultural accumulation of ideas through history; (personal) individual development during a lifetime; and (neuronal) the milliseconds during which cognitive inference happens.
2. A good theory must explain the behaviour of the system on each of Marr’s three levels of analysis[1]: (1) the computational problem the system is solving; (2) the algorithm the system uses to solve that problem; and (3) how that algorithm is implemented in the “physical hardware” of the system. And, the part I think Marr is missing, the third level also has to include an explanation of how the learning environment affects the agent [https://www.lesswrong.com/posts/RCbofC8fCJ6NnYti7/intro-to-ontogenetic-curriculum].
3. A good theory must at least make an attempt at answering the main questions: how is the generality of intelligence achieved? What is the neural substrate of memory? Etc.

To meet these requirements I’ve combined insights from several fields: developmental psychology, neuroscience, ethology, and computational models of mind. The result is the Narrative Theory. The research is still far from completion but there ar

Wednesday, June 14th 2023

Shortform
Portia · 2d · 8 points
ACCURATELY ASSESSING SEX-RELATED CHARACTERISTICS SAVES LIVES. CAN WE MAKE IT FAIR TO ALL HUMANS, WOMEN, MEN, TRANS AND INTER FOLKS? A NERDY IDEA.

Sex-related characteristics are medically relevant; accurately assessing them saves lives. But neither assigned sex nor gender identity alone properly captures them. Is anyone else interested in designing a characteristic string instead, so all humans, esp. all women and gender diverse folks, get proper medical care?

This idea started yesterday, when I had severe abdominal pain, and started googling. Eventually, I reached sites that listed various potential conditions. Some occur in all people (e.g., stomach ulcers), albeit often not with the same presentation and frequency; others have very specific sex-based requirements (e.g. an ovarian cyst, or testicular torsion). Some webpages introduced ovary-related things as “In women, it can also be…” Well, I thought - I highly doubt my trans girlfriend has an ovarian cyst. But we are used to getting medical advice that does not fit for her, aren't we? (In retrospect, why did I think that was okay, just because it was so common?) Other sites, apparently wanting to prevent this, stated “we use female in this text to refer to people assigned female at birth”. I was happy that they had thought about this and cared, but… frankly, that does not work either. I was assigned female at birth; that means I was born, and a doctor visually inspected me, and declared “female”. And yet I most certainly do not have a fallopian tube pregnancy now, because I had my tubes surgically removed, which also sterilised me. I’m as likely as the dude next door to have a fallopian tube pregnancy now. An inter person assigned female at birth may also be dead certain they do not have an ectopic pregnancy, because their visual inspection at birth actually misjudged their genes and organs quite a bit.

I wondered what I would have liked the website writers to use instead. And the more I thought about it, I th

Tuesday, June 13th 2023

Shortform
Yitz · 3d · 5 points
Does anyone here know of (or would be willing to offer) funding for creating experimental visualization tools? I’ve been working on a program which I think has a lot of potential, but it’s the sort of thing where I expect it to be most powerful in the context of “accidental” discoveries made while playing with it (see e.g. early use of the microscope, etc.).
Prometheus · 3d · 2 points
The following is a conversation between myself in 2022, and a newer version of me earlier this year. On the Nature of Intelligence and its "True Name":

2022 Me: This has become less obvious to me as I’ve tried to gain a better understanding of what general intelligence is. Until recently, I always made the assumption that intelligence and agency were the same thing. But General Intelligence, or G, might not be agentic. Agents that behave like RL agents may only be narrow forms of intelligence, without generalizability. G might be something closer to a simulator. From my very naive perception of neuroscience, it could be that we (our intelligence) are not agentic, but just simulate agents. In this situation, the prefrontal cortex not only runs simulations to predict its next sensory input, but might also run simulations to predict inputs from other parts of the brain. In this scenario, “desire” or “goals” might be simulations to better predict narrowly-intelligent agentic optimizers. Though the simulator might be myopic, I think this prediction model allows for non-myopic behavior, in a similar way to how GPT has non-myopic behavior, despite only trying to predict the next token (it has an understanding of where a future word “should” be within the context of a sentence, paragraph, or story). I think this model of G allows for the appearance of intelligent goal-seeking behavior, long-term planning, and self-awareness. I have yet to find another model for G that allows for all three. The True Name of G might be Algorithm Optimized To Reduce Predictive Loss.

2023 Me: Interesting, me’22, but let me ask you something: you seem to think this majestic ‘G’ is something humans have, but other species do not, and then name the True Name of ‘G’ to be Algorithm Optimized To Reduce Predictive Loss. Do you *really* think other animals don’t do this? How long is a cat going to survive if it can’t predict where it’s going to land? Or where the mouse’s path trajectory is heading? Did you th
devansh · 3d · 2 points
(I promised I'd publish this last night no matter what state it was in, and then didn't get very far before the deadline. I will go back and edit and improve it later.)

I feel like I keep, over and over, hearing a complaint from people who get most of their information about college admissions from WhatsApp groups or their parents’ friends or a certain extraordinarily pervasive subreddit (you all know what I’m talking about). Something like “College admissions is ridiculous! Look at this person, who was top of his math class and took 10 AP classes and started lots of clubs, he didn’t get into a single Ivy, he’s going to UCLA!” I think the closest analogy I can find for this is something like “look at this guy, he’s 7 feet tall, didn’t even make it to the NBA!” There’s something important that they’re both missing, some fundamental confusion between a tiny part of the overall metric and reality.
riceissa · 3d · 2 points
I used to have a model of breathing that went something like this: when breathing in, the lungs somehow get bigger, creating lower air pressure inside the lungs causing air to flow in. Then when breathing out the lungs get smaller, creating higher air pressure inside the lungs and causing air to flow out. How do the lungs get bigger and smaller? Eventually I learned that there's a muscle called the diaphragm that is attached to the bottom of the lungs (??) that pulls or pushes the lungs. If I keep my nose plugged but my mouth open, the air will travel through my mouth. If I keep my mouth closed but my nose open, the air will travel through my nostrils. So far, so good. Then a few days ago, I noticed that if I keep both my nose and mouth open, I could choose to breathe in solely through one or the other. This... doesn't make sense, according to the model. The model would predict that the air just flows through both pathways, maybe preferentially going through the mouth since that seems like the larger pathway. So something is clearly wrong with how I think about breathing. Is there some sort of further switch inside that blocks one of the pathways? Does the nose or the mouth contain variable-size cavities that can control air pressure to direct the flow? I still have no idea. I'm eventually going to look it up, but I might think about this for a little bit longer (or maybe someone here will tell me). I thought this was a pretty interesting example of how the explanations you hear about seemingly-basic things are easy to accept but don't make sense on further reflection. But it's hard to notice the flaw too. In my case, after a recent ENT visit where I was told my nasal passages are inflamed, I've been putting more effort into consciously breathing through my nose. Then one day I woke up and as soon as I woke up I did something like consciously breathe through my nose with mouth closed, and then somehow I opened my mouth but then still tried to breathe through my n

Monday, June 12th 2023

Shortform
Czynski · 4d · 21 points
This got deleted from 'The Dictatorship Problem [https://www.lesswrong.com/posts/pFaLqTHqBtAYfzAgx/the-dictatorship-problem]', which is catastrophically anxietybrained, so here's the comment:

This is based in anxiety, not logic or facts. It's an extraordinarily weak argument. There's no evidence presented here which suggests rich Western countries are backsliding. Even the examples in Germany don't have anything worse than what the US GOP produced ca. 2010. (And Germany is, due to their heavy censorship, worse at resisting fascist ideology than anyone with free speech, because you can't actually have those arguments in public.) If you want to present this case, take all those statistics and do economic breakdowns, e.g. by deciles of per-capita GDP. I expect you'll find that, for example, the Freedom House numbers show a substantial drop in 'Free' in the 40%-70% range and essentially no drop in 80%-100%.

Of the seven points given for the US, all are a mix of maximally-anxious interpretation and facts presented misleadingly. These are all arguments where the bottom line ("Be Afraid") has been written first; none of this is reasonable unbiased inference. The case that mild fascism could be pretty bad is basically valid, I guess, but without an actual reason to believe that's likely, it's irrelevant, so it's mostly just misleading to dwell on it.

Going back to the US points, because this is where the underlying anxiety prior is most visible: Interpretation, not fact. We're still in early enough stages that the reality of Biden is being compared to an idealized version of Trump - the race isn't in full swing yet and won't be for a while. Check back in October when we see how the primary is shaping up and people are starting to pay attention. This has been true for a while. Also, in assessing the consequences, it's assuming that Trump will win, which is correlated but far from guaranteed. Premise is a fact, conclusion is interpretation, and not at all a reliable one.
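The economic breakdown suggested in the comment above could be sketched like this. All numbers are invented purely to show the shape of the analysis (they are not real Freedom House or GDP data): bucket countries into per-capita-GDP deciles and average the change in a freedom score within each bucket.

```python
def decile(rank, n):
    # 0-based decile index for a country ranked `rank` (0..n-1) by GDP.
    return min(9, rank * 10 // n)

def breakdown(countries):
    # countries: list of (gdp_per_capita, change_in_freedom_score)
    ranked = sorted(countries, key=lambda c: c[0])
    buckets = {}
    for rank, (_, delta) in enumerate(ranked):
        buckets.setdefault(decile(rank, len(ranked)), []).append(delta)
    # mean change per decile
    return {d: sum(v) / len(v) for d, v in sorted(buckets.items())}

# Toy, invented numbers: poorer countries showing larger score drops.
toy = [(1_000, -8), (2_500, -6), (9_000, -7), (15_000, -2),
       (30_000, -1), (45_000, 0), (60_000, 0), (80_000, 1)]
per_decile = breakdown(toy)
```

Run against the real country-level tables, a dict like `per_decile` would directly test the claim that the decline is concentrated in the middle and lower GDP deciles.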
mako yass · 4d · 9 points
There's something very creepy to me about the part of research consent forms where it says "my participation was entirely voluntary."

1. Do they really think an involuntary participant wouldn't sign that? If they understand that they would, what purpose could this possibly serve, other than, as is commonly the purpose of contracts, absolving themselves of blame and moving blame to the participant? Which would be downright monstrous. Probably they just aren't fucking consequentialists, but this is all they end up doing.
2. This is a minor thing, but it adds an additional creepy garnish: nothing is 100% voluntary, because everything is a function of the involuntary base reality that other people command force and resources, and we want to use them for things, so we have to go along with what other people want to some extent. I'm at peace with this, and I would prefer not to have to keep denying it, and it feels like I'm being asked to participate in the addling of moral philosophy.
Johannes C. Mayer · 4d · 3 points
I have a heuristic for evaluating topics to potentially write about: I especially look for topics that people are usually averse to writing about. Topics that score high according to this heuristic might be good to write about, as they can yield content with high utility compared to what is available, simply because other content of this kind (and especially good content of this kind) is rare. Somebody told me that they read some of my writing and liked it. They said that they liked how honest it was. Perhaps writing about topics selected with this heuristic tends to invoke that feeling of honesty. Maybe just by being about something that people normally don't like to be honest about, or talk about at all. That might at least be part of the reason.
lc · 4d · 2 points
"No need to invoke slippery slope fallacies, here. Let's just consider the Czechoslovakian question in and of itself" - Adolf Hitler
James Spencer · 4d · 1 point
WILL INTERNATIONAL AI ALIGNMENT COOPERATION TRUMP THE RIGHTS OF WEAKER COUNTRIES? TLDR - REAL COOPERATION ON INTERNATIONAL AI REGULATION MAY ONLY BE POSSIBLE THROUGH A MUCH MORE PEACEFUL BUT UNSENTIMENTAL FOREIGN POLICY

In 1987 President Reagan said to the United Nations "how quickly our differences worldwide would vanish if we were facing an alien threat from outside this world." Isn't an unaligned Artificial General Intelligence that alien threat? It's easy - and perhaps overly obvious and comforting - to say that humanity would unite, but now that we have this threat, what would that unity look like? Here's one not necessarily comforting thought: the weak (nations) will get trampled further by the strong (nations). If cooperation rather than competition among powers is vital, then wouldn't we need to prioritise keeping powerful and potentially powerful countries - at least in AI terms - onside, over other ideological concerns? To see what this looks like, let's look at some of those powerful countries:

* China - the obvious one: would we need to annoy the national security hawks over Taiwan, but also decent, humane liberals over Tibet and Sichuan?
* Russia - Ukraine would annoy just about everybody
* Israel - well, this happens already because of domestic considerations, but it might reverse domestic political calculations
* UK - the British are a big player in AI (and seemingly more important than the EU), so would needling them about Northern Ireland really be worth ticking off the one reliable ally the US has with clout?

This is before looking at the role of countries that may be important in relation to AI and who the US wouldn't want going rogue on regulation, but who neighbour China - such as Japan, South Korea and the chip superpower Taiwan.

Sunday, June 11th 2023

Shortform
DirectedEvolution · 5d · 25 points
Epistemic activism

I think LW needs better language to talk about efforts to "change minds." Ideas like asymmetric weapons and the Dark Arts are useful but insufficient. In particular, I think there is a common scenario where:

* You have an underlying commitment to open-minded updating and possess evidence or analysis that would update community beliefs in a particular direction.
* You also perceive a coordination problem that inhibits this updating process for a reason that the mission or values of the group do not endorse.
  * Perhaps the outcome of the update would be a decline in power and status for high-status people. Perhaps updates in general can feel personally or professionally threatening to some people in the debate. Perhaps there's enough uncertainty in what the overall community believes that an information cascade has taken place. Perhaps the epistemic heuristics used by the community aren't compatible with the form of your evidence or analysis.
* Solving this coordination problem to permit open-minded updating is difficult due to lack of understanding or resources, or by sabotage attempts.

When solving the coordination problem would predictably lead to updating, then you are engaged in what I believe is an epistemically healthy effort to change minds. Let's call it epistemic activism for now. Here are some community touchstones I regard as forms of epistemic activism:

* The founding of LessWrong and Effective Altruism
* The one-sentence declaration on AI risks
* The popularizing of terms like Dark Arts, asymmetric weapons, questionable research practices, and "importance hacking"
* Founding AI safety research organizations and PhD programs to create a population of credible and credentialed AI safety experts; calls for AI safety researchers to publish in traditional academic journals so that their research can't be dismissed for not being subject to institutionalized peer review
Dalcy Bremin · 5d · 2 points
Why haven't mosquitos evolved to be less itchy? Is there just not enough selection pressure posed by humans yet? (yes probably) Or are they evolving towards that direction? (they of course already evolved towards being less itchy while biting, but not enough to make that lack-of-itch permanent) this is a request for help i've been trying and failing to catch this one for god knows how long plz halp tbh would be somewhat content coexisting with them (at the level of houseflies) as long as they evolved the itch and high-pitch noise away, modulo disease risk considerations.
O O · 6d · 1 point
A realistic takeover angle would be hacking into robots once we have them. We probably don’t want any way for robots to get over-the-air updates, but it’s unlikely for this to be banned.

Saturday, June 10th 2023

Shortform
Dalcy Bremin · 6d · 4 points
Having lived ~19 years, I can distinctly remember around 5~6 times when I explicitly noticed myself experiencing totally new qualia, with my inner monologue going “oh wow! I didn't know this dimension of qualia was a thing.” Examples:

* a hard-to-explain sense that my mind is expanding horizontally, with fractal cube-like structures (think bismuth) forming around it and my subjective experience gliding along its surface, which lasted for ~5 minutes after taking zolpidem for the first time to sleep (2 days ago)
* getting drunk for the first time (half a year ago)
* feeling absolutely euphoric after having a cool math insight (a year ago)
* ...

Reminds me of myself around a decade ago, completely incapable of understanding why my uncle smoked, thinking "huh? The smoke isn't even sweet, why would you want to do that?" Now that I have [addiction-to-X] as a clear dimension of qualia/experience solidified in myself, I can better model their subjective experiences, although I've never smoked myself. Reminds me of the SSC classic [https://slatestarcodex.com/2014/03/17/what-universal-human-experiences-are-you-missing-without-realizing-it/]. Also, one observation is that it feels like the rate at which I acquire these is getting faster, probably because of an increase in self-awareness + an increased option space as I reach adulthood (like being able to drink). Anyways, I think it’s really cool, and can’t wait for more.
DirectedEvolution · 6d · 4 points
Lightly edited for stylishness
Dagon · 6d · 3 points
I give some probability space to being a Boltzmann-like simulation.  It's possible that I exist only for an instant, experience one quantum of input/output, and then am destroyed (presumably after the extra-universal simulators have measured something about the simulation). This is the most minimal form of Solipsism that I have been configured to conceive.  It's also a fun variation of MWI (though not actually connected logically) if it's the case that the simulators are running multiple parallel copies of any given instant, with slightly different configurations and inputs.
DirectedEvolution · 6d · 3 points
I use ChatGPT as a starting point to investigate hypotheses to test at my biomedical engineering job on a daily basis. I am able to independently approach the level of understanding of specific problems of a chemist with many years of experience on certain problems, although his familiarity with our chemical systems and his education make him faster to arrive at the same result. This is a lived example of the phenomenon in which AI improves the performance of lower-tier performers more than higher-tier performers (I am a recent MS grad, he is a post-postdoc).

So far, I haven't been able to get ChatGPT to independently troubleshoot effectively or propose improvements. This seems to be partly because it struggles profoundly to grasp and hang onto the specific details I have provided to it. It's as if our specific issue is mixed with the more general problems it has encountered in its training. Or as if, whereas in the real world, strong evidence is common [https://www.lesswrong.com/posts/JD7fwtRQ27yc8NoqS/strong-evidence-is-common], to ChatGPT, what I tell it is only weak evidence. And if you can't update strongly on evidence in my research world, you just can't make progress.

The way I use it instead is to validate and build confidence in my conjectures, and as an incredibly sophisticated form of search. I can ask it how very specific systems we use in our research, not covered in any one resource, likely work. And I can ask it to explain how complex chemical interactions are likely behaving in specific buffer and heat conditions. Then I can ask it how adjusting these parameters might affect the behavior of the system. An iterated process like this combines ChatGPT's unlimited generalist knowledge with my extremely specific understanding of our specific system to achieve a concrete, testable hypothesis that I can bring to work after a couple of hours. It feels like a natural, stimulating process. But you do have to be smart enough to steer th
JNS · 6d · 2 points
I got my entire foundation torn down, and with it came everything else. It all came crashing down in one giant heap of rubble. I’ll just rebuild, I thought - not realizing you can’t build without a foundation plan. So all I’ve ended up doing was sift through the rubble, searching for things that feel right. Now I am back, in a very literal sense, to where it all began. So much was built, so many things destroyed and corrupted, and a major piece ended and got buried. And all I’ve got is “what the eff am I doing here?” The obvious answer is “yelling at the sky demanding answers” and being utterly ignored. I guess as per usual it is all up to me, except I don’t know how to rebuild myself……again. F…..

Friday, June 9th 2023

Shortform
Dalcy Bremin · 7d · 3 points
i absolutely hate bureaucracy, dumb forms, stupid websites etc. like, I almost had a literal breakdown trying to install Minecraft recently (and eventually failed). God.
Quinn · 7d · 3 points
"EV is measure times value" is a sufficiently load-bearing part of my worldview that if measure and value were correlated or at least one was a function of the other I would be very distressed. Like in a sense, is John [https://www.lesswrong.com/posts/voLHQgNncnjjgAPH7/utility-maximization-description-length-minimization] threatening to second-guess hundreds of years of consensus on is-ought?
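The load-bearing identity above can be written out concretely: expected value is the measure-weighted sum of value over outcomes, with measure and value entering as independent factors. The numbers below are toy values purely for illustration.

```python
# "EV is measure times value": EV = sum over outcomes of measure * value.
def expected_value(measure, value):
    assert abs(sum(measure) - 1.0) < 1e-9, "measure should sum to 1"
    return sum(m * v for m, v in zip(measure, value))

# Three outcomes with probabilities 0.5/0.3/0.2 and values 10/0/-5:
ev = expected_value([0.5, 0.3, 0.2], [10.0, 0.0, -5.0])  # 5.0 + 0.0 - 1.0 = 4.0
```

The worry in the post corresponds to the case where `measure` and `value` stop being independent inputs, i.e. where one is secretly a function of the other.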
Stephen Fowler · 7d · 3 points
Are humans aligned? Bear with me! Of course, I do not expect there is a single person browsing Shortforms who doesn't already have a well thought out answer to that question. The straightforward (boring) interpretation of this question is "Are humans acting in a way that is moral, or otherwise behaving like they obey a useful utility function?" I don't think this question is particularly relevant to alignment. (But I do enjoy whipping out my best Rust Cohle impression [https://www.youtube.com/watch?v=Z5vwDfg3JNQ]) Sure, humans do bad stuff, but almost every human manages to stumble along in a (mostly) coherent fashion. In this loose sense we are "aligned" to some higher-level target, it just involves eating trash and reading your phone in bed. But I don't think this is a useful kind of alignment to build off of, and I don't think this is something we would want to replicate in an AGI. Human "alignment" is only being observed in an incredibly narrow domain. We notably don't have the ability to self-modify, and of course we are susceptible to wire-heading. Nothing about current humans should indicate to you that we would handle this extremely out-of-distribution shift well.
kuira · 7d · 3 points
It's interesting that an intelligence in the 'original'/'top-level' universe might also [if simulation theory is valid] have evidence to assume it's close-to-certainly simulated. Maybe it would do acausal trade and precommit to not shutting down simulated intelligences.
Omega. · 7d · 1 point
Quick updates:

* Our next critique (on Conjecture) will be published in 10 days.
* The critique after that will be on Anthropic. If you'd like to be a reviewer, or have critiques you'd like to share, please message us or email anonymouseaomega@gmail.com.
* If you'd like to help edit our posts (incl. copy-editing - basic grammar etc., but also tone & structure suggestions and fact-checking/steel-manning), please email us!
  * We'd like to improve the pace of our publishing and think this is an area where external perspectives could help us
  * Make sure our content & tone is neutral & fair
  * Save us time so we can focus more on research and data gathering

Thursday, June 8th 2023

Shortform
Czynski · 8d · 11 points
The 'new user' flag being applied to old users with low karma is condescending as fuck. I'm not a new user. I'm an old user who has spent most of my recent time on LW telling people things they don't want to hear. Well, most of the time I've actually spent posting weekly meetups, but other than that.
Garrett Baker · 8d · 5 points
Last night I had a horrible dream: that I had posted to LessWrong a post filled with useless & meaningless jargon without noticing what I was doing, then I went to sleep, and when I woke up I found I had <−60 karma on the post. When I read the post myself I noticed how meaningless the jargon was, and I myself couldn't resist giving it a strong-downvote.
DirectedEvolution · 8d · 5 points
Over the last six months, I've grown more comfortable writing posts that I know will be downvoted. It's still frustrating. But I used to feel intensely anxious when it happened, and now, it's mostly just a mild annoyance. The more you're able to publish your independent observations, without worrying about whether others will disagree, the better it is for community epistemics.
1
3jacquesthibs8d
AI labs should be dedicating a lot more effort to using AI for cybersecurity as a way to prevent weights or insights from being stolen. It would be good for safety, and it seems like it could be a pretty big cash cow too. If they have access to the best (or specialized) models, it may be highly beneficial for them to plug those in immediately to help with cybersecurity (perhaps even including noticing suspicious activity from employees). I don't know much about cybersecurity, so I'd be curious to hear from someone who does.
3Quinn8d
messy, jotting down notes: * I saw this thread https://twitter.com/alexschbrt/status/1666114027305725953, which my housemate had been warning me about for years. * The failure mode can be understood as trying to aristotle the problem: a lack of experimentation. * Thinking about the nanotech ASI threat model, where it solves nanotech overnight and deploys adversarial proteins into the bloodstreams of all the lifeforms. * These are sometimes justified by Drexler's inside view of boundary conditions and physical limits. * But to dodge the aristotle problem, there would have to be some amount of bandwidth passing between sensors and actuators (which may roughly correspond to the number of do-applications in Pearl). * Can you use something like communication complexity https://en.wikipedia.org/wiki/Communication_complexity (between a system and an environment) to think about a "lower bound on the number of sensor-actuator actions", mixed with sample complexity (statistical learning theory)? * Like, ok, if you're simulating all of physics you can aristotle nanotech, but for a sufficient definition of "all" you would run up against realizability problems, and it would cost way more than you actually need to spend. I'm thinking that if there's a kind of complexity theory of Pearl (the number of do-applications needed to achieve some kind of "loss"), then you could direct it at something like "nanotech projects" to Fermi-estimate the way AIs might trade off between applying aristotelian effort (observation and induction with no experiment) and spending sensor-actuator interactions (with the world). There's a scenario in the sequences, if I recall correctly, about which physics an AI infers from 3 frames of a video of an apple falling, and something about how security mindset suggests you shouldn't expect your information-theoret
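The "aristotle the problem" point above can be made concrete with a toy Pearl-style example: two causal models that are indistinguishable from passive observation alone, but separated by a single do() intervention. This is a hypothetical sketch (the model names and setup are illustrative, not anything from the thread):

```python
import random

# Toy illustration: the causal models X -> Y and Y -> X can induce the
# same observational joint distribution, so no amount of pure observation
# ("aristotling") distinguishes them. One do() intervention does.

def sample(model, do_x=None, n=1000, seed=0):
    """Sample (x, y) pairs from 'x->y' or 'y->x'; do_x clamps X."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        if model == 'x->y':
            x = rng.random() if do_x is None else do_x
            y = x  # Y deterministically copies its cause X
        else:  # 'y->x'
            y = rng.random()
            x = y if do_x is None else do_x  # intervention severs Y -> X
        pairs.append((x, y))
    return pairs

def perfectly_correlated(pairs):
    return all(abs(x - y) < 1e-9 for x, y in pairs)

# Observationally identical: X and Y perfectly correlated under both models.
print(perfectly_correlated(sample('x->y')))              # True
print(perfectly_correlated(sample('y->x')))              # True

# Under do(X=0.5), only the X -> Y model keeps the correlation.
print(perfectly_correlated(sample('x->y', do_x=0.5)))    # True
print(perfectly_correlated(sample('y->x', do_x=0.5)))    # False
```

The point of the toy: some bits about the world are only purchasable with sensor-actuator interactions, which is the intuition behind counting do-applications as a resource.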

Wednesday, June 7th 2023

Shortform
7Mitchell_Porter9d
Eliezer recently tweeted that most people can't think, even most people here [https://twitter.com/ESYudkowsky/status/1665165312247975937], but at least this is a place where some of the people who can think, can also meet each other [https://twitter.com/ESYudkowsky/status/1665439386089955330]. This inspired me to read Heidegger's 1954 book What is Called Thinking? [https://en.wikipedia.org/wiki/What_Is_Called_Thinking%3F] (pdf [https://www.sas.upenn.edu/~cavitch/pdf-library/Heidegger_What_Is_Called_Thinking.pdf]), in which Heidegger also declares that despite everything, "we are still not thinking". Of course, their reasons are somewhat different. Eliezer presumably means that most people can't think critically, or effectively, or something. For Heidegger, we're not thinking because we've forgotten about Being, and true thinking starts with Being. Heidegger also writes, "Western logic finally becomes logistics, whose irresistible development has meanwhile brought forth the electronic brain." So of course I had to bring Bing into the discussion. Bing told me what Heidegger would think of Yudkowsky [https://pastebin.com/XccznywE], then what Yudkowsky would think of Heidegger [https://pastebin.com/EeS9qMMg], and finally we had a more general discussion about Heidegger and deep learning [https://pastebin.com/LPryEh0E] (warning, contains a David Lynch spoiler). Bing introduced me to Yuk Hui [https://en.wikipedia.org/wiki/Yuk_Hui], a contemporary Heideggerian who started out as a computer scientist, so that was interesting. But the most poignant moment came when I broached the idea that perhaps language models can produce philosophical essays without actually thinking. Bing defended its own sentience, and even creatively disputed the Lynchian metaphor, arguing that its "road of thought" is not a "lost highway", just a "different highway". (See part 17, line 254.)
6O O9d
If alignment is difficult, it is likely inductively difficult (difficult regardless of your base intelligence), and an ASI will be cautious about creating a misaligned successor or upgrading itself in a way that risks misalignment. You may argue it's easier for an AI to upgrade itself, but if the process is hardware-bound or requires radical algorithmic changes, the ASI will need to create an aligned successor, as preferences and values may not transfer directly to new architectures or hardware. If alignment is easy, we will likely solve it with superhuman narrow intelligences and aligned near-peak-human-level AGIs. I think the first case is an argument against FOOM, unless the alignment problem is solvable but only at higher-than-human levels of intelligence (human meaning the intellectual prowess of the entire civilization equipped with narrow superhuman AI). That would be a strange but possible world.
1
4Writer9d
Rational Animations has a subreddit: https://www.reddit.com/r/RationalAnimations/ I hadn't advertised it until now because I had to find someone to help moderate it. I want people here to be among the first to join, since I expect having LessWrong users early on will help foster a good epistemic culture.
2lc9d
The greatest generation imo deserves their name, and we should be grateful to live on their political, military, and scientific achievements.
2O O9d
The fact that this was completely ignored is a little disappointing. This is a very important question that would help put upper bounds on value drift, but it seems that answering it limits the imagination when it comes to ASI. Has there ever been an answer to it? I have a feeling larger brains have a harder coordination problem between their subcomponents, especially when you hit information-transfer limits. This would put some hard limits on how much you can scale intelligence, but I may be wrong. A Fermi estimate of the upper bounds of intelligence may eliminate some problem classes alignment arguments tend to include.
4
