David Matolcsi

I fixed some parts that were easy to misunderstand: by the $500k I meant the LW hosting + software subscriptions and the dedicated software + accounting costs together. And I didn't mean to imply that the labor cost of the 4 people is $500k; that was a separate term in the costs.

Is Lighthaven still cheaper if we take into account the initial funding spent on it in 2022 and 2023? I was under the impression that buying Lighthaven was one of the things that made a lot of sense when the community believed it would have access to FTX funding, and that once we bought it, it makes sense to keep it, but we wouldn't have bought it with FTX out of the game. But if this was a misunderstanding on my part and Lighthaven saves money in the long run compared to the previous option, that's great news.

I donated $1000. Originally I was worried that this is a bottomless money-pit, but looking at the cost breakdown, it's actually very reasonable. If Oliver is right that Lighthaven funds itself apart from the labor cost, then the real costs are $500k for the hosting, software and accounting costs of LessWrong (this is probably an unavoidable cost and seems obviously worthy of being philanthropically funded), plus paying 4 people (equivalent to 65% of 6 people's time) to work on LW moderation and upkeep (it's unavoidable to have some people working on LW, 4 seems reasonable, and this is also something that obviously should be funded), plus paying 2 people to keep Lighthaven running (given the surplus value Lighthaven generates, it seems reasonable to fund this), plus a one-time cost of $1 million to cover the initial cost of Lighthaven (I'm not super convinced it was a good decision to abandon the old Lightcone offices for Lighthaven, but I guess it made sense in the funding environment of the time, and once we made this decision, it would be silly not to fund the last $1 million of the initial cost before Lighthaven becomes self-funded). So altogether I agree that this is a great thing to fund, and it's very unfortunate that some of the large funders can't contribute anymore.

(All of this relies on the hope that Lighthaven actually becomes self-funded next year. If it keeps producing big losses, then I think the funding case will become substantially worse. But I expect Oliver's estimates to be largely trustworthy, and we can still decide to decrease funding in later years if it turns out Lighthaven isn't financially sustainable.)

I'm considering donating. Can you give us a little more information on the breakdown of the costs? What are the typical large expenses that make up the $1.6 million upkeep of Lighthaven? Is this a usual cost for a similarly sized event space, or is it something about the location or the specialness of the place that makes it more expensive?

How much money does running LW cost? The post says it's >$1M, which somewhat surprised me, but I have no idea what the usual cost of running such a site is. Is the cost mostly server hosting, or salaries for content moderation, or salaries for software development, or something I haven't thought of?

Importantly, the oracle in the story is not making an elementary mistake; I think it's true that it's "probably" in a simulation. (Most of the measure of beings like it is in simulations.) It is also not maximizing reward; it is just honestly reporting what it expects its future observations about the President to be (which are within the simulation).

I agree with many of the previous commenters, and acknowledged in the original post, that we don't know how to build such an AI that just honestly reports its probabilities of observables (even if they depend on crazy simulation things), so all of this is hypothetical; but having such a truthful Oracle was the initial assumption of the thought experiment.

Even assuming that the simulators have wildly different values, why would doing something insane be a good thing to do?

I always assume when thinking about future dangerous models that they have access to some sort of black-box memory. Do we think there is a non-negligible chance that an AI that doesn't have hidden memory, only English-language CoT, will be able to evade our monitoring and execute a rogue deployment? (Not a rhetorical question; there might be a way I haven't thought of.)

So I think that assuming the AI is stateless when thinking about future risk is not a good idea, as I think the vast majority of the risk comes from AIs for which this assumption is not true. I think this makes the post's claim significantly weaker, as the untrusted AI can just learn which of its attacks get replaced and improve its attacks based on this data.

It still needs to avoid getting a top 1% suspiciousness score and getting audited though, as it would be dangerous for the AI to ever get caught writing a text that clearly intends to jailbreak the monitor. So I agree that even an AI with black-box memory will still be severely constrained in how many and how egregious jailbreak attempts it can make.
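For concreteness, here is a minimal sketch of the kind of monitoring setup I have in mind; the thresholds, names, and the "defer to a trusted model" fallback are my own illustrative assumptions, not anything specified in the post.

```python
# Illustrative sketch (assumed thresholds and interfaces) of a trusted-monitoring
# loop: the top 1% most suspicious outputs get audited by humans, a larger
# suspicious slice gets silently replaced by a trusted model's output, and the
# rest is executed as-is.

AUDIT_THRESHOLD = 0.99    # assumed: top 1% suspiciousness percentile gets audited
REPLACE_THRESHOLD = 0.90  # assumed: the next slice gets replaced rather than run


def handle_output(untrusted_output, suspiciousness, trusted_model, auditor):
    """Decide what to do with one output of the untrusted model.

    suspiciousness is a percentile in [0, 1] from the monitor; trusted_model
    and auditor are stand-ins for a weaker trusted model and a human audit
    process.
    """
    if suspiciousness >= AUDIT_THRESHOLD:
        # The dangerous branch for the AI: a caught jailbreak attempt is
        # seen by humans.
        return auditor.review(untrusted_output)
    if suspiciousness >= REPLACE_THRESHOLD:
        # Replaced silently: an AI with black-box memory can notice that its
        # attack never executed and update its strategy, which is the concern
        # raised above.
        return trusted_model.generate()
    return untrusted_output
```

Only the first branch exposes the attempt to humans; the replacement branch is exactly the silent feedback channel a stateful AI could learn from.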

Hm, probably we disagree on something. I'm very confused about how to mesh epistemic uncertainty with these "distribution over different Universes" types of probability. When I say "Boltzmann brains are probably very low measure", I mean "I think Boltzmann brains are very low measure, but this is a confusing topic, there might be considerations I haven't thought of, and I might be totally mistaken". I think this epistemic uncertainty is distinct from the type of "objective probabilities" I talk about in my post, and I don't really know how to use language without referring to degrees of my epistemic uncertainty.

You still need a prior over worlds to calculate impacts, which is the cursed part.

Maybe we have some deeper disagreement here. It feels plausible to me that there is a measure of "realness" in the Multiverse that is an objective fact about the world, and we might be able to figure it out. When I say probabilities are cursed, I just mean that even if an objective prior over worlds and moments exist (like the Solomonoff prior), your probabilities of where you are are still hackable by simulations, so you shouldn't rely on raw probabilities for decision-making, like the people using the Oracle do. Meanwhile, expected values are not hackable in the same way, because if they recreate you in a tiny simulation, you don't care about that, and if they recreate you in a big simulation or promise you things in the outside world (like in my other post), then that's not hacking your decision making, but a fair deal, and you should in fact let that influence your decisions. 

Is your position that the problem is deeper than this, and there is no objective prior over worlds, it's just a thing like ethics that we choose for ourselves, and then later can bargain and trade with other beings who have a different prior of realness?

I think it only came up once for a friend. I translated it and it makes sense; it just replaces the appropriate English verb with a Chinese one in the middle of a sentence. (I note that this often happens to me too when I talk with my friends in Hungarian: I'm sometimes more used to the English phrase for something, and say one word in English in the middle of the sentence.)

I like your poem on Twitter.

I think that Boltzmann brains in particular are probably very low measure though, at least if you use Solomonoff induction. If you think that weighting observer moments within a Universe by their description complexity is crazy (which I kind of feel), then you need to come up with a different measure on observer moments, but I expect that if we find a satisfying measure, Boltzmann brains will be low measure in that too.
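To spell out the weighting I have in mind (standard Solomonoff notation, nothing specific to this discussion): the measure of an output $x$ is

$$m(x) \;=\; \sum_{p \,:\, U(p)=x} 2^{-|p|},$$

where $U$ is a universal prefix machine and $|p|$ is the length of program $p$. The usual argument, which I find plausible but won't defend in detail here, is that picking out a particular Boltzmann brain requires specifying both the physics and a very long address into a huge thermal volume, so the shortest programs producing it are long and its measure is exponentially small.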

I agree that there's no real answer to "where you are"; you are a superposition of beings across the multiverse, sure. But I think probabilities are kind of real: if you make up some definition of which beings are sufficiently similar to you that you consider them "you", then you can have a probability distribution over where those beings are, and it's a fair equivalent rephrasing to say "I'm in this type of situation with this probability". (This is what I do in the post. It's very unclear, though, why you'd ever want to estimate that; that's why I say that probabilities are cursed.)

I think expected utilities are still reasonable. When you make a decision, you can estimate which beings' decisions correlate with this one and what the impact of each of their decisions is, and calculate the sum of all that. I think it's fair to call this sum expected utility. It's possible that you don't want to optimize for the direct sum, but for something determined by "coalition dynamics"; I don't understand the details well enough to really have an opinion.
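As a minimal formalization of what I mean (my own notation, not anything from the post): if $i$ ranges over the beings whose decisions correlate with this one, $w_i$ is the measure of being $i$, and $\Delta U_i(a)$ is the impact of being $i$ taking action $a$, then the sum I'm calling expected utility is

$$\mathrm{EU}(a) \;=\; \sum_i w_i \,\Delta U_i(a).$$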

(My guess is we don't have real disagreement here and it's just a question of phrasing, but tell me if you think we disagree in a deeper way.)
