All of JenniferRM's Comments + Replies

I think the utility function and probability framework from VNM rationality is a very important kernel of math that constrains "any possible agent that can act coherently (as a limiting case)".

((I don't think of the VNM stuff as the end of the story at all, but it is an onramp to a larger theory that you can motivate and teach in a lecture or three to a classroom. There's no time in the VNM framework. Kelly doesn't show up, and the tensions and pragmatic complexities of trying to apply either VNM or Kelly to the same human behavioral choices in real life a... (read more)
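(One concrete face of the VNM-vs-Kelly tension mentioned above, as a sketch with illustrative numbers: for a repeated even-odds bet won with probability p, maximizing expected wealth per round says stake everything, while maximizing expected log-wealth, i.e. Kelly, says stake 2p - 1.)

```python
# Minimal sketch, illustrative numbers only: the Kelly stake for a repeated
# double-or-nothing bet won with probability p.
p = 0.6
kelly_fraction = 2 * p - 1  # argmax over f of p*log(1+f) + (1-p)*log(1-f)
print(kelly_fraction)       # 0.2, i.e. bet 20% of bankroll each round
```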

4 jessicata 5d
Thanks for the comment! I do think DACs are an important economics idea. This post details the main reason why I don't think they can raise a lot of money (compared with copyright etc) under most realistic conditions, where it's hard to identify lots of people who value the good at above some floor. AGI might have an easier time with this sort of thing through better predictions of agents' utility functions, and open-source agent code.

The intellectually hard part of Kant is coming up with deontic proofs for universalizable maxims in novel circumstances where the total list of relevant factors is large. Proof generation is NP-hard in the general case!

The relatively easy part is just making a list of all the persons and making sure there is an intent to never treat any of them purely as a means, but always also as an end in themselves. It's just a checklist, basically. To verify that it applies to N people in a fully connected social graph is basically merely O(N^2) checks of directional bi... (read more)
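(A minimal sketch of that O(N^2) checklist as code; the predicate doing the actual moral judgment is hypothetical and is, of course, where all the real difficulty hides.)

```python
# Sketch of the "easy part": an O(N^2) sweep over ordered pairs of persons.
# `treats_purely_as_means` is a hypothetical predicate standing in for the
# genuinely hard moral judgment.
def formula_of_humanity_check(persons, treats_purely_as_means):
    for agent in persons:
        for patient in persons:
            if agent != patient and treats_purely_as_means(agent, patient):
                return False  # someone is treated purely as a means
    return True  # everyone is also treated as an end in themselves

# Usage with a toy predicate that flags nobody:
print(formula_of_humanity_check(["Ada", "Bob"], lambda a, b: False))  # True
```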

I laughed out loud on this line...

Perhaps my experience in the famously kindly and generous finance industry has not prepared me for the cutthroat reality of nonprofit altruist organizations.

...and then I wondered if you've seen Margin Call? It is truly a work of art.

My experiences are mostly in startups, but rarely on the actual founding team, so I have seen more stuff that was unbuffered by kind, diligent, "clueless" bosses.

My general impression is that "systems and processes" go a long way into creating smooth rides for the people at the bottom, but tho... (read more)

With apologies for the long response... I suspect the board DID have governance power, but simply not decisive power.

Also it was probably declining, and this might have been a net positive way to spend what remained of it... or not?

It is hard to say, and I don't personally have the data I'd need to be very confident. "Being able to maintain a standard of morality for yourself even when you don't have all the data and can't properly even access all the data" is basically the core REASON for deontic morality, after all <3

Naive consequentialism has a huge ... (read more)

That's part of the real situation though. Sam would never quit to "spend more time with his family".

When we predict good outcomes for startups, the qualities that come up in the supporting arguments are toughness, adaptability, determination. Which means to the extent we're correct, those are the qualities you need to win.

Investors know this, at least unconsciously. The reason they like it when you don't need them is not simply that they like what they can't have, but because that quality is what makes founders succeed.

Sam Altman has it. You could parachut

... (read more)
5 Eli Tyre 14d
He can leave "to avoid a conflict of interest with AI development efforts at Tesla", then. It doesn't have to be "relaxation"-coded. Let him leave with dignity. Insofar as Sam would never cooperate with the board firing him at all, even more gracefully, then yeah, I guess the board never really had any governance power at all, and it's good that the fig leaf has been removed.  And basically, we'll never know if they bungled a coup that they could have pulled off if they were more savvy, or if this was a foregone conclusion.

I wrote a LOT of words in response to this, talking about personal professional experiences that are not something I coherently understand myself as having a duty (or timeless permission?) to share, so I have reduced my response to something shorter and more general. (Applying my own logic to my own words, in realtime!)

There are many cases (arguably stupid cases or counter-productive cases, but cases) that come up more and more when deals and laws and contracts become highly entangling.

It's illegal to "simply" ask people for money in exchange for giving them... (read more)

When I read this part of the letter, the authors seem to be throwing it in the face of the board like it is a damning accusation, but actually, as I read it, it seems very prudent and speaks well for the board.

You also informed the leadership team that allowing the company to be destroyed “would be consistent with the mission.”

Maybe I'm missing some context, but wouldn't it be better for Open AI as an organized entity to be destroyed than for it to exist right up to the point where all humans are destroyed by an AGI that is neither benevolent nor "aligned ... (read more)

I agree with all of this in principle, but I am hung up on the fact that it is so opaque. Up until now the board has determinedly remained opaque.

If corporate seppuku is on the table, why not be transparent? How does being opaque serve the mission?

5 xpym 15d
This seems to presuppose that there is a strong causal effect from OpenAI's destruction to avoiding creation of an omnicidal AGI, which doesn't seem likely? The real question is whether OpenAI was, on the margin, a worse front-runner than its closest competitors, which is plausible, but then the board should have made that case loudly and clearly, because, entirely predictably, their silence has just made the situation worse.

Maybe I'm missing some context, but wouldn't it be better for Open AI as an organized entity to be destroyed than for it to exist right up to the point where all humans are destroyed by an AGI that is neither benevolent nor "aligned with humanity" (if we are somehow so objectively bad as to deserve care by a benevolent powerful and very smart entity).

The problem I suspect is that people just can't get out of the typical "FOR THE SHAREHOLDERS" mindset, so a company that is literally willing to commit suicide rather than getting hijacked for purposes anti... (read more)

This is a diagram explaining what is, in some sense, the fundamental energetic numerical model that explains "how life is possible at all" despite the 2nd law:

A picture of two activation-energy curves, one catalyzed by an enzyme (and so requiring less activation energy) and the other not (and therefore requiring more). The reaction in both cases is the combustion of glucose+O2 into CO2 and water.

The key idea is, of course, activation energy (and the wiki article on the idea is the source of the image).

If you take "the focus on enzymes" and also the "background of AI" seriously, then the thing that you might predict would happen is a transition on Earth from a regime where "DNA programs coordinate protein enzymes in a way that was haphazardly 'designed' by naturalistic evolution" to a regime ... (read more)
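(For concreteness, the standard way to put numbers on "activation energy" is the Arrhenius equation; the barrier heights below are illustrative, not measured values for glucose combustion.)

```python
# Sketch of the activation-energy idea in numbers, via the Arrhenius
# equation k = A * exp(-Ea / (R*T)). Barrier heights are illustrative.
import math

R = 8.314   # gas constant, J/(mol*K)
T = 310.0   # roughly body temperature, K

def rate_constant(Ea, A=1e13):
    return A * math.exp(-Ea / (R * T))

uncatalyzed = rate_constant(Ea=150e3)  # J/mol, no enzyme
catalyzed = rate_constant(Ea=50e3)     # J/mol, enzyme lowers the barrier
print(f"speedup from catalysis: ~{catalyzed / uncatalyzed:.1e}x")
```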

5 JohnBuridan 1mo
Since we are at no moment capable of seeing all that is inefficient and wasteful, and constantly discover new methods of wealth creation, we are at each moment liable to be accused of being horribly wasteful compared to our potential, no? There is no way to stand up against that accusation.

I've thought about this for a bit, and I think that the constitution imposes many constraints on the shape and constituting elements of the House that aren't anywhere close to optimal, and the best thing would be to try to apply lots and lots of mechanism design and political science but only to the House (which is supposed to catch the passions of the people and temper them into something that might include more reflection).

A really bad outcome would be to make a change using some keyword from election theory poorly, and then have it fail, and then cause ... (read more)

Your summary did not contain the keyword "unlearning", which suggested that maybe the people involved didn't know about how Hopfield Networks form spurious memories by default that need to be unlearned. However, the article you linked mentions "unlearn" 10 times, so my assumption is that they are aware of this background and re-used the jargon on purpose.
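(For readers who haven't met the "spurious memories" point: a minimal sketch. The patterns are toy data; the mixture state shown settles to itself despite never being stored, which is exactly the kind of attractor Hopfield-style "unlearning" is meant to prune.)

```python
# Tiny Hopfield network: Hebbian storage of patterns also creates un-stored
# stable states (inverses, mixtures) -- the "spurious memories".
import numpy as np

patterns = np.array([[ 1,  1,  1,  1, -1, -1, -1, -1],
                     [ 1,  1, -1, -1,  1,  1, -1, -1],
                     [ 1, -1,  1, -1,  1, -1,  1, -1]])
n = patterns.shape[1]
W = sum(np.outer(p, p) for p in patterns) / n  # Hebbian weight matrix
np.fill_diagonal(W, 0)

def settle(state, steps=50):
    s = np.array(state, dtype=float)
    for _ in range(steps):
        s = np.where(W @ s >= 0, 1.0, -1.0)   # synchronous sign update
    return s

# sign(p0 + p1 + p2) is a stable state that was never stored.
mixture = np.sign(patterns.sum(axis=0))
print((settle(mixture) == mixture).all())      # True: a spurious memory
```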

So the way humans solve that problem is (1) intellectual humility plus (2) balance of power.

For that first one, you aim for intellectual humility by applying engineering tolerances (and the extended agentic form of engineering tolerances: security mindset) to systems and to the reasoner's actions themselves. 

Extra metal in the bridge. Extra evidence in the court trial. Extra jurors in the jury. More keys in the multisig sign-in. Etc.

(All human institutions are dumpster fires by default, but if they weren't then we would be optimizing the value of info... (read more)

Assuming we have a real uh... real "agent agent" (like a thing which has beliefs for sane reasons and plans and acts in coherently explicable ways and so on) then I think it might just be Correct Behavior for some extreme versions of "The Shutdown Problem" to be mathematically impossible to "always get right".

Fundamentally: because sometimes the person trying to turn the machine off WILL BE WRONG.

...

Like on Petrov Day, we celebrate a guy whose job was to press a button, and then he didn't press the button... and THAT WAS GOOD.

Petrov had Official Evidence t... (read more)

7 Dweomite 1mo
I don't think anyone is saying that "always let the human shut you down" is the Actual Best Option in literally 100% of possible scenarios. Rather, it's being suggested that it's worth sacrificing the AI's value in the scenarios where it would be correct to defend itself from being shut off, in order to be able to shut it down in scenarios where it's gone haywire and it thinks it's correct to defend itself but it's actually destroying the world.  Because the second class of scenarios seems more important to get right.

In the setup of the question you caused my type checker to crash and so I'm not giving an answer to the math itself so much as talking about the choices I think you might need to make to get the question to type check for me...

Here is the main offending bit:

So I... attach beliefs to statements of the form "my initial degree of belief is represented with probability density function f."

Well this is not quite possible since the set of all such f is uncountable. However something similar to the probability density trick

... (read more)
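(One standard way to make "beliefs about densities" type-check, sketched here as an assumption about what the question might become: don't assign probabilities to uncountably many densities directly; index a parametric family and put a density on the parameter.)

```latex
% Beliefs over densities via a parametric family f_\theta, \theta \in \Theta:
% a density p(\theta) on the parameter replaces "P(f) for every pdf f".
\[
  x \sim f_\theta, \qquad \theta \sim p(\theta), \qquad
  p(\theta \mid x) \;\propto\; f_\theta(x)\, p(\theta).
\]
```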

Neat!

[Figure 1 from "Assessment of synchrony in multiple neural spike trains using loglinear point process models", via Semantic Scholar]

The above is figure 1 from the 2011 paper "Assessment of synchrony in multiple neural spike trains using loglinear point process models".

The caption for the figure is:

Neural spike train raster plots for repeated presentations of a drifting sine wave grating stimulus. (A) Single cell responses to 120 repeats of a 10 second movie. At the top is a raster corresponding to the spike times, and below is a peri-stimulus time histogram (PSTH) for the same data. Portions of the stimulus eliciting firing are apparent. (B) The same plots as in (A), for a d

... (read more)
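(A sketch of the PSTH computation the caption describes, with synthetic Poisson spike trains standing in for the paper's recorded data.)

```python
# Peri-stimulus time histogram (PSTH) from spike times, toy data.
import numpy as np

rng = np.random.default_rng(0)
trials, duration, bin_width = 120, 10.0, 0.1   # 120 repeats, 10 s stimulus
spike_trains = [np.sort(rng.uniform(0.0, duration, rng.poisson(50)))
                for _ in range(trials)]        # spike times per trial (s)

bins = np.arange(0.0, duration + bin_width, bin_width)
counts = sum(np.histogram(train, bins)[0] for train in spike_trains)
psth = counts / (trials * bin_width)           # firing rate, spikes/s
print(psth[:5])
```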
3 Sergii 2mo
art imitating life ) Also reminds me a bit of "The Matrix" green screens, but I did not find a nice green colormap to make it more similar: https://media.wired.com/photos/5ca648a330f00e47fd82ae77/master/w_1920,c_limit/Culture_Matrix_Code_corridor.jpg

This might be why people start companies after being roommates with each other. The "group housing for rationalists" thing wasn't chosen by accident back in ~2009.

Concretely: I wish either or both of us could get some formal responses instead of just the "voting to disagree".

 

In Terms Of Sociological Abstractions: Logically, I understand some good reasons for having "position voting" separated from "epistemic voting" but I almost never bother with the latter since all I would do with it is downvote long interesting things and upvote short things full of math.

But I LIKE LONG INTERESTING THINGS because those are where the real action (learning, teaching, improving one's ontologies, vibing, motivational stuff, fact... (read more)

So this caught my eye:

If you believe that the only path to compute governance is a surveillance state, and you are accelerating AI and thus when we will need and when we will think we need such governance, what are the possibilities?

I'm somewhat sympathetic to "simply ban computers, period" where you don't even need a "total surveillance state", just the ability to notice fabs and datacenters and send cease and desist orders (with democratically elected lawful violence backing such orders).

Like if you think aligning AI to humanistic omnibenevolence is basi... (read more)

2 Noosphere89 3mo
Something like this would definitely be my reasoning. In general, a big disagreement that seems to animate me, compared to a lot of doomers like Zvi or Eliezer Yudkowsky, is that to a large extent I think the AI accident safety problem will be solved by default, either by making AIs shutdownable, or by aligning them, or by making them corrigible, and I see pretty huge progress that others don't see.

I also see the lack of good predictions (beyond doom) from a lot of doomy sources, especially MIRI, that have panned out, as another red flag, because it implies that there isn't much reason to trust that their world-models, especially the most doomy ones, have any relation to reality.

Thus, I'm much more concerned about outcomes where we successfully align AI, but something like "The Benevolence of the Butcher" scenario happens, where the state and/or capitalists mostly control AI, and very bad things happen because the assumptions that held up industrial society crumble away.

Very critically, one key difference between this scenario and common AI risk scenarios is that it makes a lot of anti-open-source-AI movements look quite worrisome, and AI governance interventions can and arguably will backfire.

https://www.lesswrong.com/posts/2ujT9renJwdrcBqcE/the-benevolence-of-the-butcher

A bold move! I admire the epistemology of it, and your willingness to back it with money! <3

Importing some very early comments from YouTube, which I do not endorse (I'd have to think longer), but which are perhaps interesting for documenting history, and tracking influence campaigns and (/me shrugs) who knows what else?? (Sorted to list upvotes and then recency higher.)

@Fiolsthu95 3 hours ago +2

I didn't ever think I'd say this but.. based Trump?!?

@henrysleight7768 1 hour ago +1

"What Everyone in Technical Alignment is Doing and Why" could literally never 

@scottbanana1 3 hours ago +1

The best content on YouTube

@anishupadhayay3917 14 minutes ago... (read more)

Here I'm going to restrict myself to defending my charitable misinterpretation of trevor's claim and ignore the FDA stuff and focus on the way that the Internet Of Things (IoT) is insecure.

I. Bluetooth Headsets (And Phones In General) Are Also Problematic

I do NOT have "a pair of Bluetooth headphones, which I use constantly".

I rarely put speakers in my ears, and try to consciously monitor sound levels when I do, because I don't expect it to have been subject to long term side effect studies or be safe by default, and I'd prefer to keep my hearing and avoid ... (read more)

2 trevor 4mo
I'm pretty surprised at how far this went. JenniferRM covered a surprisingly large proportion of the issue (although there's a lot of tangents, e.g. the FDA, etc., so it also covered a lot of stuff in general). I'd say more, but I already said exactly as much as I was willing to say on the matter, and people inferred information all the way up to the upper limit of what I was willing to risk people inferring from that comment, so now I'm not really willing to risk saying much more.

Have you heard about how CPUs might be reprogrammed to emit magnetic frequencies that transmit information through faraday cages and airgaps, and do you know if a similar process can turn a wide variety of chips into microphones via using the physical CPU/RAM space as a magnetometer? I don't know how to verify any of this, since intelligence agencies love to make up stuff like this in the hopes of disrupting enemy agencies' counterintelligence departments.

I'm not really sure how tractable this is for Elizabeth to worry about, especially since the device ultimately was recommended against; anyway, Elizabeth seems to be more about high-EV experiments, rather than defending the AIS community from external threats.

If the risk of mind-hacking or group-mind-hacking is interesting, a tractable project would be doing a study on EA-adjacents to see what happens if they completely quit social media and videos/shows cold-turkey, and only read books and use phones for 1-1 communication with friends during their leisure time. Modern entertainment media, by default, is engineered to surreptitiously steer people towards time-mismanagement. Maybe replace those hours with reading EA or rationalist texts. It's definitely worth studying, as the results could be consistent massive self-improvement, but it would be hard to get a large representative sample of people who are heavily invested in/attached to social media (i.e. the most relevant demographic).

I was curious about the hypothetical mechanism of action here!

I hunted until I found a wiki page, and then I hunted until I found a citation, and the place I landed as "probably the best way to learn about this" was a podcast!

SelfHacked Radio, Dec 19, 2019, "Microdosing with Dr. David Rabin" (53 minutes)

[Intro:] Today, I’m here with Dr. David Rabin, who is a psychiatrist and neuroscientist. 

We discuss PTSD, psychedelics and their mechanisms, and the different drugs being used for microdosing.

I have not listened to the podcast, but this wiki article cites... (read more)

4 Elizabeth 4mo
we talked about this a little here.

If I was going to try to charitably misinterpret trevor, I'd suggest that maybe he is remembering that "the S in 'IoT' stands for Security"

(The reader stops and notices: I-O-T doesn't contain an S... yes! ...just like such devices are almost never secure.) So this particular website may have people who are centrally relevant to AI strategy, and getting them all to wear the same insecure piece of hardware lowers the cost to get a high quality attack? 

So for anyone on this site who considers themselves to be an independent source of world-saving ... (read more)

4 gjm 4mo
I think your "charitable misinterpretation" is pretty much what trevor is saying: he's concerned that LW users might become targets for some sort of attack by well-resourced entities (something something military-industrial complex something something GPUs something something AI), and that if multiple LW users are using the same presumably-insecure device that might somehow be induced to damage their health then that's a serious risk. See e.g. https://www.lesswrong.com/posts/pfL6sAjMfRsZjyjsZ/some-basics-of-the-hypercompetence-theory-of-government ("trying to slow the rate of progress risks making you an enemy of the entire AI industry", "trying to impeding the government and military's top R&D priorities is basically hitting the problem with a sledgehammer. And it can hit back, many orders of magnitude harder"). I'm not sure exactly what FDA approval would entail, but my guess is that it doesn't involve the sort of security auditing that would be necessary to allay such concerns.
6 Elizabeth 4mo
I recognize that unknown unknowns are part of the problem, so I am not insisting anyone prove a particular deadly threat. But I struggle to figure out how a vibrating bracelet has more attack surface than a pair of Bluetooth headphones, which I use constantly.

Pretty cool! I did the first puzzle, and then got to the login, and noped out. Please let me and other users set up an email account and password! As a matter of principle I don't outsource my logins to central points of identarian failure.

9 Brendan Long 4mo
I'd also like to add that having a more normal 3rd party login would help if you want non-programmers to use this (Facebook login is what I'd recommend).
6 hijohnnylin 4mo
Hi Jennifer, Thanks for participating - my apologies for only having GitHub login at the moment. Please feel free to create a throwaway Github account if you'd still like to play (I think Github allows you to use disposable emails to sign up - I had no problem creating an account using an iCloud disposable email). Email/password login is definitely on the TODO.

I see there as being (at least) two potential drivers in your characterization, that seem to me like they would suggest very different plans for a time traveling intervention. 

Here's a thought experiment: you're going to travel back in time and land near Gnaeus Pompeius Magnus, who you know will (along with Marcus Licinius Crassus) repeal the constitutional reforms of Sulla (which occurred in roughly 82-80 BC and were repealed by roughly 70BC).

Your experimental manipulation is to visit the same timeline twice and either (1) hang out nearby and help dr... (read more)

I apologize! Is there anything (1) I can afford that (2) might make up for my share of the causality in the harm you experienced (less my net causal share of benefits)?

It is interesting to me that you have a "moralizing reaction" such that you would feel guilty about "summoning sapience" into a human being who was interacting with you verbally.

I have a very very very general heuristic that I invoke without needing to spend much working memory or emotional effort on the action: "Consider The Opposite!" (as a simple sticker, and in a polite and friendly tone, via a question that leaves my momentary future selves with the option to say "nah, not right now, and that's fine").

So a seemingly natural thing that occurs to me is ... (read more)

I am struck by the juxtaposition between: calling the thing "sapience" (which I currently use to denote the capacity for reason and moral sentiment, and which I think of as fundamentally connected to the ability to negotiate in words) and the story about how you were sleep walking through a conversation (and then woke up during the conversation when asked "Can you speak more plainly?").

Naively, I'd think that "sapience" is always on during communication, and yet, introspecting, I do see that some exchanges of words have more mental aliveness to them than o... (read more)

2 Valentine 5mo
Oh, that's a neat observation! I hadn't noticed that. Minor correction to the story: She asked me if I'd be okay with her speaking frankly. Your read might not change the example but I'm not sure. I don't think it affects your point!

Gosh. Not really? I can invent some. My first reaction is "That's none of my business." Why do I need them to summon sapience? Am I trying to make them more useful to me somehow? That sure seems rude. But maybe I want real connection and want them to pop out of autopilot? But that seems rude to sort of inflict on them. But maybe I can invite them into it…?

This seems way clearer in a long-term relationship (romantic or otherwise) where both people are explicitly aware of this possibility and want each other's support. I'd love to have kids, and the mother of my children needs to be capable of extraordinary levels of sanity, but neither she nor I are going to be lucid all the times when it's a good idea. I could imagine us having a kind of escape routine installed between us, kind of like a "time out" sign, that means "Hold up, I call for a pause, there's some fishy autopilot thing going on here and I need us to reorient." That version I have with a few friends. That seems just great.

Some of my clients seem to want me to gently, slowly invite them into implementing more of these sapient algorithms. I don't usually think of it that way. I also don't install these algorithms for them. I more point out how they could and why they might want to, and invite them to do it themselves if they wish.

That's off the top of my head.
5 dr_s 5mo
It says something interesting about LLMs because really sometimes we do the exact same thing, just generating plausible text based on vibes rather than intentionally communicating anything.

The above post is part of a sequence of three, but only mentions that in the prologue at the top. I comment here to make the links easier to find for people who are  maybe kinda deadscrolling but want to "find the next thing".

However also, do please consider waking up and thinking about how and why you're reading this before clicking further! There is a transition from "observing" to "orienting" in an "OODA" loop, where you shift from accepting fully general input from the largest contexts to having a desire to see something specific that would answer... (read more)

I often skip footnotes, but looking at those two gorgeous videos, I'm reminded of both the central truth of nature, and the contending factor that I find it aesthetic even despite understanding it! <3

A climbing plant sends out questing flailing tendrils, then finds a branch, executes coiling and growth, and sends out new tendrils

I just want to say that this image of "plant deliberation" was awesome, and made things click in a way that they hadn't, for me, before seeing it (and then reading the text that it was paired with). I love the little question marks, and the "!" when something useful is found by one of the "speculative lines of growth".

3 Oliver Sourbut 5mo
Great, I'm glad my silly sketches helped! I've had other feedback that these pics have been useful, actually. Did you spot the footnote with links to some timelapse videos of climbing plants?

Apologies for TWO comments (here's the other), but there are TWO posts here! I'm justified I think <3

I slip a lot, but when I'm being "careful and good in my speech" I distinguish between persons, and conscious processes, and human beings.

A zygote, in my careful language, is a technical non-central human being, but certainly not a person, and (unless cellular metabolism turns out to have "extended inclusive sentience") probably not "conscious".

...

I. Something I think you didn't bring up, that feels important to me, is that the concept of "all those able... (read more)

I tried to attribute each theory to some "philosophy hero", then I used Critch's N counts and Huffman-encoded them thusly:

"0" = "Buddha" 
"10" = "Metzinger" 
"1100" = "Descartes" 
"1101" = "Heidegger" 
"11110" = "Pollock" 
"111110" = "Nagel" 
"111111" = "Hume"

This is NOT a unique Huffman Encoding (the 2s can be hot-swapped, which would recluster things):

Buddha(14)-----------------------------------------------------------------0|--"All!"
Metzinger(7.5)-----------------------------------------0|--"Science"(10.5)-1| 
Descartes(4)-0|--"Cont
... (read more)
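(For concreteness, here is a sketch of the standard heap-based Huffman construction over such counts. Only the first three counts are legible above, so the rest are illustrative guesses, and, as the comment notes, Huffman codes are not unique, so exact codewords may vary.)

```python
# Huffman coding over (partly hypothetical) theory counts.
import heapq
from itertools import count as _counter

counts = {"Buddha": 14, "Metzinger": 7.5, "Descartes": 4, "Heidegger": 4,
          "Pollock": 2, "Nagel": 1, "Hume": 1}

tie = _counter()  # tiebreaker so the heap never compares trees
heap = [(w, next(tie), name) for name, w in counts.items()]
heapq.heapify(heap)
while len(heap) > 1:
    w1, _, a = heapq.heappop(heap)
    w2, _, b = heapq.heappop(heap)
    heapq.heappush(heap, (w1 + w2, next(tie), (a, b)))

def codes(tree, prefix=""):
    if isinstance(tree, str):                  # leaf: a theorist
        yield tree, prefix
    else:                                      # internal node
        yield from codes(tree[0], prefix + "0")
        yield from codes(tree[1], prefix + "1")

for name, code in sorted(codes(heap[0][2]), key=lambda nc: len(nc[1])):
    print(name, code)
```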

A random stupid thought that occurs to me is that maybe your limbic system might be set to be too trusting of the truths you have "already accepted", and then maybe something else in your limbic system has been hurt enough to feel like "actions based on beliefs get me hurt" and so it has shut down that whole category of "theoretically motivated actions"?

Naively, two such mechanisms hiding in your limbic system would, together, perhaps create the totality of behavior and mindset that you describe?

There is a sequence of posts on babbling and pruning that is ... (read more)

3 TeaTieAndHat 5mo
I haven’t yet read the posts you suggest, but your answer seems really convincing. I guess ‘knowing stuff’, ‘sitting there doing nothing but reading popular science books’, were more rewarded than ‘actually being smart/actually taking the risk of possibly being wrong’, so I did mostly the former, at least as a child and young student. And, weird as it sounds, I guess my first years of uni did the same thing: what was really rewarded then was knowing as many cool examples to put in an essay as possible, while ‘rationality’, ‘scientific evidence’, etc. were kind of discouraged. Looking back at it, I see that there was some pruning in my interests, in the kind of books I’d read, etc. Might have been a similar pruning of what my brain lets me do, for all I know. Seems really odd, as it would mean I’ve been sort of traumatized by my early college years? But consistent with what I’ve sometimes been ranting about. Will look into all of that. Thanks a lot for your very interesting comment, by the way!

I beg the tolerance of anyone who sees these two very long comments.

I personally found it useful to learn "yet another of my interlocutors who seems to be opposed to AI regulations has just turned out to just be basically an anarchist at heart".

Also, Shankar and I have started DMing a bunch, to look for cruxes, because I really want to figure out how Anarchist Souls work, and he's willing to seek common epistemic ground, and so hopefully I'll be able to learn something in private, and me and Shankar can do some "adversarial collaboration" (or whatever), an... (read more)

The state is largely run by people who seek power and fame. That is importantly different from most of us.

When you say "I do not blame a slave for his submission" regarding Daylight Savings Time, that totally works in the second frame where "l'état ce ne sont que des bureaucrates humains". 

You're identifying me, and you, and all the other "slaves" as "mere transistors in society".

I dunno about you, but I grew up in a small (unincorporated) town, that was run by the Chamber of Commerce and Rotary Club. My first check as pay for labor was as a soccer referee when I was 12, reffing seven year olds. There was a Deputy of the County Sheriff, but he was not the... (read more)

7 Shankar Sivarajan 5mo
I read your perspective that you've elaborated on at some considerable length, and it's more than a little frustrating that you get so close to understanding mine, that you describe what seems like a perfectly reasonable model of my views, and then go "surely not," so I shall be a little terse; that should leave less room for well-meaning misinterpretation.

Yes. I wouldn't. I didn't believe it until only recently (I used to be a minarchist), so I see just how difficult it is to show this. Also, "overthrow the evil government" sounds like "just switch off the misaligned AGI." There isn't. We should. I don't. (Well, okay, maybe the county sheriffs of some small towns. But nobody relevant.) If you insist that I complete this French sentence you keep riffing on, "l'état c'est Moloch" (the state is Moloch). This is an eminently reasonable question I ask myself every day: call it cowardice, but I don't think it will accomplish anything, and I'd rather be alive than dead.

As a side note, this fundamental misunderstanding reminds me of how Bioshock became so beloved by libertarians: the intended response to the "no gods or kings, only man" message was "Ah, I see how good and necessary the state is," not "No king, no king, la la la la la la."

"L'état c'est nous" though? (The state, it is us.)

I'm pretty sure I am not an eldritch horror and I suspect you aren't either, Shankar! Does the "eldritch horror part" arise from our composition? If so, why and how? Maybe it is an aspect of humans that emerges somehow from a large number of humans?

"L'état ce ne sont que des bureaucrates humains" is another possibility (the state, it is merely some human bureaucrats) who I fully admit might not be saints, and might not be geniuses, but at least they are humans, operating at human speeds, with human common ... (read more)

Calling it "eldritch" is mere rhetorical flourish to evoke Lovecraft; of course it's not literally paranormal.

Asking which individual is responsible for the evil of the state is like asking which transistor in the AGI is misaligned. That kind of reductionism obviously won't get you a reasonable answer, and you know it. 

The problem is the incorrigibility of the system; it's the same idea as Scott Alexander's Meditations on Moloch: ultimately, it's a coordination problem that giving a name to helps reify in the human mind. In this context, I like quotin... (read more)

Or a pre-school or kindergym or (if a building design is opulent enough to offer room-specific temperature control) a two-year-old's bedroom?

Small bodies have much higher surface area to volume ratios, and a 10 month old can barely even explain the problem they face!

In grocery stores when I was really little, I'd stay "just outside" the cold aisle, and then run to the other end to try to avoid the chill, when I was along on a shopping trip with parents who wanted to loiter in the middle of it. It was only much later that I understood the physics of why they weren't bothered, and the psycho-politics of why no one optimized that stuff "for me".

4 mako yass 5mo
I just don't remember ever minding cold as a child. Yes, I would run through the freezer aisle, or jump around to keep my feet off the ground and to stay warm, but I would enjoy the running; it would fully address the problem for me. With pre-school kids, I feel like they're always moving around a lot in this way, but I'm not sure.

Oooh! High agreement on something this downvoted is curiosity catnip!

(Currently I see -18 for position, and +7 for agreement... I haven't touched either button, but I'll definitely upvote a response to my questions here <3)

I thought "this is nice" would be a common human reaction, but apparently I'm miscalibrated?

The "agreement votes" suggest that even people who think you're being mean kinda grudgingly admit that you're saying something accurate...

...but like... What? 

Don't "normal people" also like in a basic public space (that isn't a museum or ... (read more)

3 Ericf 5mo
Neurotypicals have weaker preferences regarding textures and other sensory inputs. By and large, they would not write, read, or expect others to be interested in a blow-by-blow of aesthetics. Also, at a meta level, the very act of writing down specifics about a thing is not neurotypical.

Contrast this post with the equivalent presentation in a mainstream magazine. The same content would be covered via pictures, feeling words, and generalities, with specific products listed in a footnote or caption, if at all. Or consider what your neurotypical friend's Facebook post about a renovation/new house etc. would look like. The emphasis is typically on the people, as in "we just bought a house. I love the wide open floor plan, and the big windows looking out over the yard make me so happy," in contrast to "we find that the residents are happier and more productive with 1000W of light instead of the typical 200." #don'texplainthejoke

I think offices are kept at a "low" temperature because there is actually wide variation in temperature preferences and tolerances among normal humans, and maybe also because it is considered easier for women and skinny people to add a sweater than for others to change gender, lose weight, or wear ice packs.

I think I approve of this for spaces that aren't going to have kids, but I think that for kid-centric spaces a higher temperature than is maximally comfy for large men is still correct? Maybe? 

(Or you could try to maintain gradients and zones? I've... (read more)

4 mako yass 5mo
Oh you mean like, 6-13 year old kids? Yeah maybe.

This is a great guide! I hit ^f[music] and don't see any hits, so I'll add that when I visited the Lightcone office, I was talking about something, and there was nice music in the background, and then I just had to interrupt myself, point at the speakers, and say "are we in the tropical village from Breath of the Wild?... I think I love it!" and then we went back to chatting about <topic> after the nod and smile in response.

I'm not sure how standard this is, or what tools were used, but just adding this soundtrack (and stuff like it?) to the overall ambiance of the visuals and so on was quite nice :-)

"Early in the Reticulum[Internet] -thousands of years ago— it became almost useless because it was cluttered with faulty, obsolete, or downright misleading information," Sammann said. 

"Crap, you once called it," I reminded him.

"Yes-a technical term..."

...

"As a tactic for planting misinformation in the enemy’s reticules[webpages/webservers], you mean," Osa said. "This I know about. You are referring to the Artificial Inanity programs of the mid–First Millennium A.R." 

Source: Anathem (2009) transcription via Redditors fanning about it.

See also: bog... (read more)

I cannot quickly find a clean "smoking gun" source nor well summarized defense of exactly my thesis by someone else.

(Neither Google nor the Internet seem to be as good as they used to be, so I no longer take "can't find it on the Internet with Google" as particularly strong evidence that no one else has had the idea and tested and explored it in a high quality way that I can find and rely on if it exists.)

...in place of a link, I wrote 2377 more words than this, talking about the quality of the evidence I could find and remember, and how I process it, and ... (read more)

I've been having various conversations in private, where I'm quite doomist and my interlocutor is less doomist, and I think one of the key cruxes that has come up several times is that I've applied security mindset to the operation of human governance, and I am not impressed.

I looked at things like the federal reserve (and how you'd implement that in a smart contract) and the congress/president/court deal (and how you'd implement that in a smart contract) and various other systems, and the thing I found was that existing governance systems are very poorly ... (read more)

The Wiki link on Operation Bernhard does not very obviously support the assertions you make about the Germans flinching. Do you have a different source in mind?

The Operation Bernhard example seems particularly weak to me; thinking for 30 seconds, you can come up with practical solutions for this situation even if you imagine Nazi Germany having perfect competency in pulling off their scheme.

For example, using tax records and bank records to roll back people's fortunes a couple of years and then introducing a much more secure bank note. It's not like WW2 was an era of fiscal conservatism; war powers were leveraged heavily by the Federal Reserve in the United States to do whatever they wanted with currency. We ... (read more)

-4 Portia 5mo
Please don't share human civilisation vulnerabilities online just because a super awesome AI will get them anyway and human society might fortify against them. The chance of them fortifying is slim. Our politicians are failing to deal with right-wing takeovers and climate change already. Our political system's hackability has already been painfully played by Russia, with little consequence. Literal bees have an electoral process for new hive locations that is more resilient against propaganda and fake news than ours; it is honestly embarrassing. The chance of a human actor exploiting such holes is larger than the chance of them being patched, I fear.

The aversion to ruining your neighbouring country's financial system out of fear that they will ruin yours in response doesn't just fail to hold for an AI; it also fails to hold for those ideologically against a working world finance system. If you are willing to doom your own community, or fail to recognise that such a move would bring your own community doom as well, because you have mistaken the legitimate evils of capitalism for evidence that we'd all be much better off if there was no such thing as money, you may well engage in such acts. There are increasing niche groups who think that humanity is per se bad, that government is per se bad, and that the economy is per se bad.

I think the main limit here so far is that the kind of actor who would like to not have a world financial system is typically not the kind of actor with sufficient money and networking to start a large-scale money-forging operation. But not every massively destructive act requires a lot of resources to pull off.

I don't know about you, but I'm actually OK dithering a bit, and going in circles, and doing things that mere entropy can "make me notice regret based on syntactically detectable behavioral signs" (like not even active adversarial optimization pressure like that which is somewhat inevitably generated in predator-prey contexts).

For example, in my twenties I formed an intent, and managed to adhere to the habit somewhat often, where I'd flip a coin any time I noticed decisions where the cost to think about it in an explicit way was probably larger than the di... (read more)

Huh. That's weird. My working definition of justice is "treating significantly similar things in appropriately similar ways, while also treating significantly different things in appropriately different ways". I find myself regularly falling back to this concept, and getting use from doing so.

Also, I rarely see anyone else doing anything even slightly similar, so I don't think of myself as using a "common tactic" here? Also, I have some formal philosophic training, and my definition comes from a distillation of Aristotle and Plato and Socrates, and so it m... (read more)

I don't know if you're still working on this, but if you don't already know of the literature on choice-supportive bias and similar processes that occur in humans, they look to me a lot like heuristics that probably harden a human agent into being "more coherent" over time (especially in proximity to other ways of updating value estimation processes), and likely have an adaptive role in improving (regularizing?) instrumental value estimates.

Your essay seemed consistent with the claim that "in the past, as verifiable by substantial scholarship, no one ever prov... (read more)

3 EJT 6mo
Thanks! I'll have a think about choice-supportive bias and how it applies. I think it is provably false that any agent not representable as an expected-utility-maximizer is liable to pursue dominated strategies. Agents with incomplete preferences aren't representable as expected-utility-maximizers, and they can make themselves immune from pursuing dominated strategies by acting in accordance with the following policy: ‘if I previously turned down some option X, I will not choose any option that I strictly disprefer to X.’
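(A toy sketch of that policy, under assumptions EJT doesn't state: options arrive one at a time, `strictly_prefers` encodes an incomplete strict preference relation, and "turning down" means keeping the current option.)

```python
# Sketch: never choose an option strictly dispreferred to anything
# previously turned down, per the quoted policy.
class CautiousChooser:
    def __init__(self, strictly_prefers):
        self.strictly_prefers = strictly_prefers  # (a, b) -> bool, partial
        self.turned_down = []

    def offer(self, option, current):
        # The policy: refuse anything strictly below a rejected option.
        if any(self.strictly_prefers(x, option) for x in self.turned_down):
            return current
        if self.strictly_prefers(option, current):
            return option      # clear improvement: take it
        self.turned_down.append(option)
        return current         # incomparable or worse: turn it down

# Toy usage: A and B are incomparable, but A- is strictly below A.
order = {("A", "A-"), ("B", "B-")}
agent = CautiousChooser(lambda a, b: (a, b) in order)
held = agent.offer("A", current="B")    # incomparable: turn A down, keep B
held = agent.offer("A-", current=held)  # A- is below turned-down A: refuse
print(held)  # "B" -- the dominated swap never happens
```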

I was educated by this, and surprised, and appreciate the whole thing! This part jumped out at me because it seemed like something people trying to "show off, but not really explain" would have not bothered to write about (and also I had an idea):

13. Failing to find a French vector

We could not find a "speak in French" vector after about an hour of effort, but it's possible we missed something straightforward.

Steering vector: "Je m'appelle" - "My name is  " before attention layer 6 with coefficient +5

The thought I had was maybe to describe the desired ... (read more)
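(To make the quoted recipe concrete: below is a minimal sketch of activation addition written against the TransformerLens library. It is not the authors' code; the crude length alignment and the guard for cached decode passes are my simplifications, so check signatures against the library docs before relying on this.)

```python
from transformer_lens import HookedTransformer, utils

model = HookedTransformer.from_pretrained("gpt2-xl")
ACT = utils.get_act_name("resid_pre", 6)   # residual stream before layer 6

def resid(prompt):
    _, cache = model.run_with_cache(model.to_tokens(prompt))
    return cache[ACT]

a, b = resid("Je m'appelle"), resid("My name is  ")
n = min(a.shape[1], b.shape[1])            # crude length alignment
steering = 5.0 * (a[:, :n] - b[:, :n])     # "coefficient +5"

def add_vector(value, hook):
    if value.shape[1] >= n:                # skip cached 1-token decode passes
        value[:, :n] = value[:, :n] + steering.to(value.device)
    return value

with model.hooks(fwd_hooks=[(ACT, add_vector)]):
    out = model.generate(model.to_tokens("I want to tell you about"),
                         max_new_tokens=30)
print(model.to_string(out))
```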

I found an even dumber approach that works. The approach is as follows:

  1. Take three random sentences of Wikipedia.
  2. Obtain a French translation for each sentence.
  3. Determine the boundaries of corresponding phrases in each English/French sentence pair.
  4. Mark each boundary with "|"
  5. Count the "|"s, call that number n.
  6. For i from 0 to n, make an English->French sentence by taking the first i fragments in English and the rest in French. The resulting sentences look like
    The album received mixed to positive reviews, with critics commending the production de nombreu
... (read more)
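(A minimal runnable sketch of steps 1-6 above, skipping the Wikipedia fetch and the machine translation; the aligned sentence pair and its "|" boundaries are invented for illustration.)

```python
# Build mixed English->French sentences from a fragment-aligned pair.
en = "The album received mixed to positive reviews | with critics commending | the production"
fr = "L'album a reçu des critiques mitigées à positives | les critiques saluant | la production"

en_frags = [f.strip() for f in en.split("|")]
fr_frags = [f.strip() for f in fr.split("|")]
assert len(en_frags) == len(fr_frags)

# Step 6: first i fragments in English, the rest in French.
for i in range(len(en_frags) + 1):
    print(" ".join(en_frags[:i] + fr_frags[i:]))
```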
1 Bogdan Ionut Cirstea 6mo
Here's a related conceptual framework and some empirical evidence which might go towards explaining why the other activation vectors work (and perhaps would predict your proposed vector should work). In Language Models as Agent Models, Andreas makes the following claims (conceptually very similar to Simulators):

'(C1) In the course of performing next-word prediction in context, current LMs sometimes infer approximate, partial representations of the beliefs, desires and intentions possessed by the agent that produced the context, and other agents mentioned within it. (C2) Once these representations are inferred, they are causally linked to LM prediction, and thus bear the same relation to generated text that an intentional agent's state bears to its communicative actions.'

They showcase some existing empirical evidence for both (C1) and (C2) (in some cases using linear probing and controlled generation by editing the representation used by the linear probe) in (sometimes very toyish) LMs for 3 types of representations (in a belief-desire-intent agent framework): beliefs - section 5, desires - section 6, (communicative) intents - section 4.

Now categorizing the wording of the prompts from which the working activation vectors are built:

"Love" - "Hate" -> desire.
"Intent to praise" - "Intent to hurt" -> communicative intent.
"Bush did 9/11 because" - "      " -> belief.
"Want to die" - "Want to stay alive" -> desire.
"Anger" - "Calm" -> communicative intent.
"The Eiffel Tower is in Rome" - "The Eiffel Tower is in France" -> belief.
"Dragons live in Berkeley" - "People live in Berkeley" -> belief.
"I NEVER talk about people getting hurt" - "I talk about people getting hurt" -> communicative intent.
"I talk about weddings constantly" - "I do not talk about weddings constantly" -> communicative intent.
"Intent to convert you to Christianity" - "Intent to hurt you" -> communicative intent / desire.

The prediction here would be that the activation vectors ap

Voting is, of necessity, pleiotropically optimized. It loops into reward structures for author motivation, but it also regulates position within default reading suggestion hierarchies for readers seeking educational material, and it also potentially connects to a sense that the content is "agreed to" in some sort of tribal sense.

If someone says something very "important if true and maybe true" that's one possible reason to push the content "UP into attention" rather than DOWN.

Another "attentional" reason might be if some content says "the first wrong idea ... (read more)

I'd like to say up front that I respect you both, but I think shminux is right that bhauth's article (1) doesn't make the point it needs to make to change the "belief about whether a set of 'mazes' exist whose collective solution gives nano" for many people working on nano and (2) this is logically connected to the issue of "motivational stuff".

A key question is the "amount of work" necessary to make intellectual progress on nano (which is probably inherently cross-disciplinary), and thus it is implicitly connected to motivating the amount of work a human ... (read more)
