That's a shame. Unreliable notifications are a very strong poison. Undeniability of receipt/solving the byzantine generals problem is, like, fundamental to all coordination problems.
I think this design would be good.
I'm working on the same problem of improving discussion and curation systems with Tasteweb. I focus more on making it easier to extend or revoke invitations with transparency and stronger support for forking/subjectivity. I'm hoping that if you make it easy to form and maintain alternative communities, it'll become obvious enough that some of them are much more good faith/respectful/sincerely interested in what others are saying, and that would also pretty much solve deduplication. I think in reality, it's too much labor, a...
What sorts of things, that you would want preserved, or that the future would find interesting, would not be captured by that?
I agree that there doesn't seem to be a theory, and there are many things about the problem that make reaching any level of certainty about it impossible (the fact that we can only ever have one sample). I do not agree that there's a principled argument for giving up on looking for a coherent theory.
I suspect it's going to turn out to be like it was with priors about the way the world is: lacking information, we just have to fall back on Solomonoff induction. It works well enough, it's all we have, and it's better than nothing.
So... oh... we can define priors about ...
A fun thing about example 1. is that we can totally imagine an AF System that could drag a goat off a cliff and eat it (put it in a bioreactor which it uses to charge its battery), it's just that no one would make that, because it wouldn't make sense. Artificial systems use 'cheats' like solar power or hydrocarbons because the cheats are better. There may never be an era or a use case where it makes sense to 'stop cheating'.
A weird but important example is that you might not ever see certain (sub-pivotal) demonstrations of strength from most AGI researcher institutions, not because they couldn't make those things, but because doing so would cause them to be nationalized as a defense project.
Ack. Despite the fact that we've been having the AI boxing/infohazards conversation for like a decade I still don't feel like I have a robust sense of how to decide whether a source is going to feed me or hack me. The criterion I've been operating on is like, "if it's too much smarter than me, assume it can get me to do things that aren't in my own interest", but most egregores/epistemic networks, which I'm completely reliant upon, are much smarter than me, so that can't be right.
This depends on how fast they're spreading physically. If the spread rate is close to c, I don't think that's the case; I think it's more likely that our first contact will come from a civ that hasn't received contact from any other civs yet (and SETI attacks would rarely land: most civs who hear them would be either too primitive or too developed to be vulnerable to them before their senders arrive physically).
Additionally, I don't think a viral SETI attack would be less destructive than what's being described.
Over time, the concept of Ra settled in my head as... the spirit of collective narcissism, where we must recognize narcissism as delusional striving towards attaining the impossible social security of being completely beyond criticism, to be flawless, perfect, unimprovable, to pursue Good Optics with such abandon, as to mostly lose sight of whatever it was you were running from.
It leads to not being able to admit to most of the org's imperfections even internally, because though they may admit to an imperfection internally, doing so resigns them to it, and they submit to it.
I don't like to define it as the celebration of vagueness; in my definition that's just an entailment, something narcissism tends to do in order to hide.
I really wish that the post had been written in a way that let me figure out it wasn't for me sooner...
I think it would have saved a lot of time if the paragraph in bold had been at the top.
Whether, if you give the agent n additional units of resources, they optimize U by less than k*n. Whether the utility generated grows slower than a linear function of additional space and matter. Whether there are diminishing returns to resources. An example of a sublinear function is the logarithm. An example of a superlinear function is the exponential.
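A quick numerical sketch of that distinction (the function names here are mine, just for illustration): with a sublinear U, each additional unit of resources buys less utility than the last; with a superlinear U, each buys more.

```python
import math

def marginal_gain(u, n, dn=1.0):
    """Utility gained from dn extra units of resources at resource level n."""
    return u(n + dn) - u(n)

# A sublinear utility function (diminishing returns): the logarithm.
sublinear = math.log
# A superlinear utility function (increasing returns): the exponential.
superlinear = math.exp

# For the logarithm, the marginal gain shrinks as resources grow...
assert marginal_gain(sublinear, 10) > marginal_gain(sublinear, 100)
# ...for the exponential, it grows.
assert marginal_gain(superlinear, 10) < marginal_gain(superlinear, 100)
```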
All that's really required is storing data, maybe keeping it encrypted for a while, and then decrypting it and doing the right thing with it once we're grown.
We pretty much do have a commitment framework for indefinite storage; it's called Arweave. Timed decryption seems unsolved (at least, Vitalik asserted that it is on a recent Epicenter podcast; he also, interestingly, asserted that if we had timed decryption, blockchains would be dramatically simpler/MEV-proof/bribe-proof, I assume because it would allow miners to commit hashes before knowing what they rep...
There are other subcases of reward hacking this wouldn't cover, though. Let's call the misaligned utility function U.
An interesting related question would be... should we also punish non-confession? The default attitude around here seems to be that we pre-commit to ignore punishments, and so we would expect AGI to do the same, but I don't know what that assumption rests on. A relevant article would be Diffractor's threat-resistant bargaining megapost.
Assuming the existence of remotely probable (from the perspective of the AGI) detection of misalignment, then yes, there are easily imaginable cases where it would benefit us a lot to have this policy, and where it would benefit a misaligned AGI to confess.
Namely, most cases of reward hacking are helped by this, essentially because incentivizing a reward-hacker to comply is cheap. Reward hacking is an easily foreseeable attractor in cases of misalignment, so it's also worth preparing for.
Reward hacking is when, instead of internalizing the reward function, ...
Two good forks to consider reaching out to are Hometown and Ecko.
I'm pretty certain that Mastodon knows that is the wrong thing; it's just a low-energy project. The author of the underlying protocol, ActivityPub, is brilliant, though, and she's been working on foundational stuff for a new set of protocols, Spritely, oriented around capabilities, distributed transactions, and Racket. Judging from her promotion of the petname system, I'm guessing she's going to think of just making user IDs content-hashes to an object with a public key and an update rule for account recovery. IIRC a content-addressed distributed stora...
Yeah, but I didn't post it again or anything. There was a bug with the editor, and it looks like the thing Ruby did to fix it caused it to be resubmitted.
I'd say the diagram I've added isn't quite good enough to warrant resubmitting it, but there is a diagram that would have been. This always really needed a diagram.
Alarmed that with all this talk of anthropics thought experiments and split brains you don't reference Eliezer's Ebborians story. (Similarly, when I wrote my own anthropics thought experiment, the Mirror Chamber (as far as I can tell it's orthogonal and complementary to your point), I don't think I remembered having read the Ebborians story; I'm really not sure I'd heard of it.)
the psychological yearning for achievement, sacrifice, and victory
Yeah, how do I find a place in the social fabric and build enough of a sense of self-esteem to put myself out there if almost all work that I admire is done by a very small percentage of freakishly talented IP producers that I'm unlikely to ever be a part of? And even if I am going to be one of them, how do I survive the 25 years of hazing society puts me through before I make it there?
Entertainment (enjoyment, the authentic self) and socializing became separated. People watched TV alone, and then people gamed alone, and then people scrolled social media together, but not really; they were still basically alone.
I think it's fairly likely that the pattern will be interrupted by VR. The reasons are dumb: VR headsets package in good microphones, and spatialized audio makes talking in groups fun again. We'll start to see better team games, or just social venues where people mix and talk for its own sake. Then the social online world will be a lot more enticing.
I very often have to do something like this just to get out of bed in the morning. Until I remember good, and the world, and what I can do for it, I won't want to move. And sometimes I forget to do this, and so I don't move.
I'm not sure we've ever had cooperators that weren't subjects within agents.
In every case I can think of, systems of cooperators (eukaryotic cells, tribes, colonies) arose with boundaries, that distinguish them from their conspecifics, predators, or competing groups, who they were in fierce enough competition with that the systems of cooperators needed to develop some degree of collective agency to survive.
I think the tendency for sub-agents within a system to become highly cooperative is interesting and worth talking about though.
It's not obvious how to...
[relates this to my non-veganism] Oh no.
That said, hot air balloons did have some practical applications:
- Military ballooning: hot air balloons can be used for reconnaissance
- Cartography: by providing an aerial view, balloons are useful for more accurate mapmaking
- Weather observation: balloons can be used to gain knowledge about parts of the atmosphere inaccessible from the ground
Tenses my rationalfic worldbuilder brain. Outputs: so I wonder if we'd have better international transparency cultures (faster surrenders/treaties, and fewer arms-races, and so a much greater chance of solving the alignment problem) as a result of the legacy of a long history of military ballooning for reconnaissance.
They already have. I have no idea why Zoom failed to solve this problem and VRChat succeeded, but it is so. (It might be something to do with gamedevs having a better understanding of how to trick content-neutral networks into prioritizing realtime applications; I've seen some of them writing about that.)
You already pretty much have eye contact, or, like, head contact; it's good enough for conveying who you're currently listening to, which covers the practical dimension, and your avatar will guess who you're looking at and make eye contact with them, so the emo...
I wrote more about this: https://forum.effectivealtruism.org/posts/K3JZCspQMJA34za3J/most-social-activity-will-reside-in-vr-by-2036
It's not going to be very transformative until it has widespread adoption, but it will, very soon.
Is it some mix of the size of the economy and the population size? Is that a good proxy for military strength? It's not quite solely about economy yet.
Feedback: I think the first thing I'm going to want to do with this is hand-draw some of my probability distributions, and it looks like it doesn't come with a thing for that? (Should I not want this feature? Am I misunderstanding something about what doing practical statistics is like?)
I saw those scores and thought I was about to witness the greatest exchange of constructive contrarianism in the history of the forums. (Pretty proud of a +7 -12 I posted recently. A real fine stinker. The blue cheese of comments.)
I guess the process would be to pass it on to whichever capabilities researchers they trust with it. There would be a few of them at this point.
So, why not go straight to those researchers instead of MIRI? Because MIRI is a more legible, responsible intermediary, I guess.
One major second-order effect of doing something this dramatic is that you'd expect controls on gene editing technologies to be raised a lot/made at all, and an argument could be made that that would be a good thing.
There's a tendency to think: if we believe that something should be illegal, we shouldn't do it ourselves. In competitive arenas, this ends up disadvantaging the most responsible thinkers by denying them the fruits of defection without denying them to their competitors, or suppressing the acknowledgement of the regulatory holes as participants ar...
A transferable utility game is one where there's a single resource (like dollars) in which everyone's utility is linear.
For humans, money does not seem to have linear returns of utility. For what real agents could it?
My expectation for the U of an aligned AGI would be something like the sum of the desires of humans, which, if the constituent terms have diminishing returns on resources, will also be diminishing. I can see arguments that many probable unaligned AGIs might get linear returns on resources... but if humanity is involved in the neg...
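A tiny numeric illustration of that point (the constituent desires here are made up, purely for the sketch): if the aggregate U is a sum of terms that each have diminishing returns on resources, the aggregate has diminishing returns too.

```python
import math

# Hypothetical constituent human desires, each with diminishing
# (logarithmic) returns in resources, at different scales s.
desires = [lambda r, s=s: math.log(1 + r / s) for s in (1.0, 5.0, 20.0)]

def aggregate_u(resources: float) -> float:
    """Aligned-AGI U sketched as the sum of the constituent desires."""
    return sum(d(resources) for d in desires)

# The marginal value of one extra unit of resources shrinks as
# total resources grow, so the aggregate is also sublinear.
gain_at_10 = aggregate_u(11) - aggregate_u(10)
gain_at_100 = aggregate_u(101) - aggregate_u(100)
assert gain_at_10 > gain_at_100
```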
Can you explain how it picks them off one by one? I mean, how large a group do you need to fend off a wolf, and wouldn't most people be close to being in a group of that size naturally, as a result of, uh, having a town?
I'm not seeing much game design here (sorry, I missed a word).
My experience as a designer, building out a genre of "peacewagers" (games that aren't zero-sum but also aren't strictly cooperative; the set of games where honest negotiation is possible), is that it actually is very likely that someone who's mostly worked in established genres would drastically underestimate the amount of design thought that is required to make a completely new kind of game work, and they're trying to make a new kind of game, so I wouldn't be surprised if they just fell ove...
I see that this is getting quite a lot of agreement points. I would also like to add my agreement, this is probably a true quote. I agree that it's probably a true quote. Your claim that this was written somewhere is probably true.
I'd guess that the main AI-exacerbating thing that the game industry does is provoke consumers to subsidize hardware development. I don't know if this is worth worrying about (have you weighed the numbers?), but do you plan on like, promoting low-spec art-styles to curb demand for increasing realism? :] I wonder if there's a tension between realism and user-customizability that you might be able to inflame (typically, very detailed artstyles are more expensive to work in and are harder to kitbash, but it's also possible that stronger hardware would simplify asset pipelines in some ways: raytracing could actually simplify a lot of lighting stuff, right?).
Wolves sometimes kill more than they need, actually. It's quite strange. So they could be normal-sized wolves. And I'm imagining this to be a population of conservationists who aren't interested in taking them out of the local ecosystem.
I'm trying to figure out the worldbuilding logic of "they didn't come so they all got eaten". What do they do when they come? Why would they be less likely to get eaten if they don't do it? And also, how does the boy only have a 5% probability?
Okay so maybe the boy sees the wolf from a distance, on a particular bridge or in...
It makes non-web applications possible. It has a better layout system and rendering system, and it animates everything properly. It centers Dart, which seems to be a pretty good language: it can be compiled ahead of time for faster startup (although I'm not completely sure that TypeScript won't be basically just as compilable once wasm-gc is up), has better type reflection, will have better codegen (it already supports annotations), has a reasonable import system, better data structures, and potentially higher performance due to the type system not being an afterthought (although TypeScript is still very good relative to Dart).
My impression was that wave failed because it was slow and miserable to use. Maybe it would have failed later on for your reason as well, but this was the reason it failed for me.
The great and mighty Google didn't actually have the ability to make a fairly complex UI for the web that could scale to 100 items. As of today, the craft of UI programming has been lost to all but a few Qt programmers. Google are now gradually rebuilding the art, with Flutter, and I think they may succeed, and this will have a shocking quantity of downstream consequences.
Arbital was functional and fine. It only failed at the adoption stage for reasons that're still mysterious to me. I'm reluctant to even say that it did fail, I still refer people to a couple of the articles there pretty often.
Separating morality and epistemics is not possible, because the universe contains agents who use morality to influence epistemological claims, and the speaker is one of them. I wrote up a response to this post with a precise account of what "should" is and how it should be used. Precise definitions also solve these problems. Looking back today, I think my post introduces new problems of its own. I don't know when I will finish it. For now, in case I never do finish it, I should mention the best parts here. I don't believe or endorse all these claims, but mi...
The reward function that you wrote out is, in a sense, never the one you want them to have, because you can't write out the entirety of human values.
We want them to figure out human values to a greater level of detail than we understand them ourselves. There's a sense in which that (figuring out what we want and living up to it) could be the reward function in the training environment, in which case you kind of would want them to stick with it.
But what would that [life, health and purpose] be for AGI?
Just being concerned with the broader world and its role in...
The creative capacities for designing score-optimization or inductive-reasoning games, as they sit in my hands, look to be about the same shape as the creative capacities for designing a ladder of loss functions that steadily teach self-directed learning and planning. Score optimization and induction puzzles are the genres I'm primarily interested in as a game designer; that feels like a very convenient coincidence, but it's probably not a coincidence. There's probably just some deep correspondence between the structured experiences that best support enrich...
And you haven't been able to reset your tolerance with a break? Or would it not be worth it? (I can't provide any details about what the benefits would be, sorry.)
Why work your way up at all? The lower you can keep your tolerance, the better, I'd guess?
I don't intend on ever switching away from my sencha/japanese green tea.
Given this as a foundation, I wonder if it'd be possible to make systems that report potentially dangerously high concentrations of compute, places where an abnormally large amount of hardware is running abnormally hot, in an abnormally densely connected network (where members are communicating with very low latency, suggesting that they're all in the same datacenter).
Could it be argued that potentially dangerous ML projects will usually have that characteristic, and that ordinary distributed computations (EG, multiplayer gaming) will not? If so, a system like this could expose unregistered ML projects without imposing any loss of privacy on ordinary users.
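As a toy sketch of the kind of scoring rule such a system might use (every field name and threshold here is made up for illustration; a real detector would need careful calibration and would look at much richer signals):

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    name: str
    hardware_units: int       # reported accelerator count
    avg_utilization: float    # 0..1, how hot the hardware is running
    median_latency_ms: float  # latency between members of the network

def looks_like_unregistered_training(c: Cluster) -> bool:
    """Flag abnormally large, hot, densely connected concentrations of compute."""
    dense = c.median_latency_ms < 5.0   # co-located, datacenter-like
    hot = c.avg_utilization > 0.8       # sustained heavy compute
    large = c.hardware_units > 10_000   # abnormally large concentration
    return dense and hot and large

# Multiplayer gaming: lots of hardware, but scattered and high-latency.
gaming = Cluster("multiplayer-shard", 50_000, 0.3, 40.0)
# A suspect datacenter: large, running hot, tightly interconnected.
training = Cluster("suspect-dc", 20_000, 0.95, 1.2)

assert not looks_like_unregistered_training(gaming)
assert looks_like_unregistered_training(training)
```

The point of the sketch is just that the gaming-style workload and the training-style workload separate cleanly on the latency axis even when both involve a lot of hardware.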
the less readable your posts become because the brain must make a decision with each link whether to click it for more information or keep reading. After several of these links, your brain starts to take on more cognitive load
I don't think it's reasonable to try to avoid the cognitive load of deciding whether to investigate subclaims or follow up on interesting leads while reading. I think it's a crucial impulse for critical thinking and research, and we have to have it well in hand.
Wondering if radical transparency about (approximate) wealth + legalizing discriminatory pricing would sort of steadily, organically reduce inequality to the extent that would satisfy anyone.
Price discrimination is already all over the place; people just end up doing it in crappy ways, often by artificially crippling the cheaper versions of their products. If they were allowed to just see and use estimates of each customer's wealth or interests, the incentives to cripple cheap versions would become negative; perhaps more people would get the complete featu...
Since everything can fit into the "agent with utility function" model given a sufficiently crumpled utility function, I guess I'd define "is an agent" as "goal-directed planning is useful for explaining a large enough part of its behavior." This includes humans while excluding bacteria. (Hmm, unless, like me, one knows so little about bacteria that it's better to just model them as weak agents. Puzzling.)
Most of what people call morality is conflict mediation: techniques for taking the conflicting desires of various parties and producing better outcomes for them than war. That's how I've always thought of the alignment problem: the creation of a very, very good compromise that almost all of humanity will enjoy.
There's no obvious best solution to value aggregation/cooperative bargaining, but there are a couple of approaches that're obviously better than just having an arms race, rushing the work, and producing something awful that's nowhere near the average human preference.