All of Dagon's Comments + Replies

Mostly agree, but also caution about being too confident in one's skepticism.  Almost all innovation is stupid until it works, and it's VERY hard to know in advance which problems end up being solvable, or what new applications come up when something is stupid for its obvious purpose but a good fit for something else.

I honestly don't know which direction this should move your opinion of Hyundai's research agenda.  Even if (as seems likely), it's not useful in car manufacturing, it may be useful elsewhere, and the project and measurement mechanisms may teach them/us something about the range of problems to address in drivetrain design.

I know that's a common saying, but I don't think I agree. Were smaller transistors a stupid idea until they worked? And then, there are some "good ideas" that were in a sense still "stupid" for a while after they started working, like some impractical early rocket weapons. I don't think it should affect your opinion of "Hyundai's research agenda" much at all. This is just normal.

On reflection, I suspect that I'm struggling with the is-ought problem in the entire project.  Physics is "is" and ethics is "ought", and I'm very skeptical that "ethicophysics" is actually either, let alone a bridge between the two.

That's fair (strong up/agree vote). If you consult my recent shortform, I lay out a more measured, skeptical description of the project. Basically, ethicophysics constitutes a globally computable Schelling Point, such that it can be used as a protocol between different RL agents that believe in "oughts" to achieve Pareto-optimal outcomes. As long as the largest coalition agrees to prefer Jesus to Hitler, I think (and I need to do far more to back this up) defectors can be effectively reined in, the same way that Bitcoin works because the majority of the computers hooked up to it don't want to destroy faith in the Bitcoin protocol.

I understand (but do not agree with) the idea of preserving someone's clickstream.  I do not want pure linkposts without any information on LessWrong.  The equation is:

V = mc²

(Violence) = (Mass of animals) × (Degree of Confinement)²

and the solution is

Quite simply: fight violence with kindness.

Fighting violence with kindness is a great plan when the entity on the other side is capable of responding to kindness. Which, most are, as long as it's framed appropriately. Kindness should be the first resort, and violence the last, but a complete unwillingness to use violence can also lead to more net suffering by yielding power to those without such compunctions.
[comment deleted] · 1d
That's fair. I just want Wayne to get out of jail soon because he's a personal friend of mine.

I suspect we have a disagreement about whether the "worked out theoretical equations" suffer from is-ought any less than the plain language version.  And if they are that fundamentally different, why should anyone think the equations CAN be explained in plain language.

I am currently unwilling to put in the work to figure out what the equations are actually describing.  If it's not the same (though with more rigor) as the plain language claims, that seriously devalues the work.

Check out my post entitled "Enkrateia" in my sequence. This is a plain language account of a safe model-based reinforcement learner using established academic language and frameworks.
Answer by Dagon · Nov 29, 2023

As others have pointed out, there's an ambiguity in the word "you". We don't have intuitions about branching or discontinuous memory paths, so you'll get different answers if you mean "a person with the memories, personality, and capabilities that are the same as the one who went into the copier" vs "a singular identity experiencing something right now".  

Q1: 100%.  A person who feels like me experiences planet A and a different person who is me experiences planet B. 

Q2: Still 100%.  One of me experiences A, one C and one D. 

Q3: Co... (read more)

Do you actually want discussion on LW, or is this just substack spam?  If you want discussion, you probably should put something to discuss in the post itself, rather than a link to a link to a PDF in academic language that isn't broken out or presented in a way that can be commented upon.

From a very light skim, it seems like your "mathematically rigorous treatment" isn't.  It includes some equations, but not much of a tie between the math and the topics you seem to want to analyze.  It deeply suffers from is-ought confusion.

Also, any solution to the alignment problem must suffer from is-ought confusion when presented in plain language rather than extensively worked out theoretical equations with extensive empirical verification.  Which part would you have me remove, the plain language, the extensively worked out theoretical equations, or my list of open problems that I hope people will use to help me assemble extensive empirical verification of my work?
I actually want discussion on LW. I'll post my list of open questions as a comment on this post and encourage people to respond to it by taking a crack at them.

[ note: I am not a libertarian, and haven't been for many years. But I am sympathetic. ]

Like many libertarian ideas, this mixes "ought" and "can" in ways that are a bit hard to follow.  It's pretty well-understood that all rights, including the right to redress of harm, are enforced by violence.  In smaller groups, it's usually social violence and shared beliefs about status.  In larger groups, it's a mix of that, and multi-layered resolution procedures, with violence only when things go very wrong.  

When you say you'd "prefer a world c... (read more)

I'm very skeptical of fairly limited experiences being used to make universal pronouncements.

I'm sure this was the experience for many individuals and teams.  I know for certain it was pretty normal and not worried about for others.  I knew a lot of MS employees in that era, though I worked at a different giant tech firm with vaguely-similar procedures.  I was senior enough, though an IC rather than a manager, to have a fair bit of input into evaluations of my team and division, and I  saw firsthand the implementation and effects of thi... (read more)

Voting is one example. Who gets "human rights" is another. A third is "who is included, with what weight, in the sum over well being in a utility function". A fourth is "we're learning human values to optimize them: who or what counts as human"? A fifth is economic fairness,

I think voting is the only one with fairly simple observable implementations.  The others (well, and voting, too) are all messy enough that it's pretty tenuous to draw conclusions about, especially without noting all the exceptions and historical violence that led to the current st... (read more)

I make that point at length in Part 3.

[ epistemic status: I don't agree with all the premises and some of the modeling, or the conclusions.  But it's hard to find one single crux.  If this comment isn't helpful, I'll back off - feel free to rebut or disagree, but I may not comment further. ]

This seems to be mostly about voting, which is an extremely tiny part of group decision-making.  It's not used for anything really important (or if it is, the voting options are limited to a tiny subset of the potential behavior space).  Even on that narrow topic, it switches from a fair... (read more)

Voting is one example. Who gets "human rights" is another. A third is "who is included, with what weight, in the sum over well-being in a utility function". A fourth is "we're learning human values to optimize them: who or what counts as human?" A fifth is economic fairness. I listed all of these examples to try to point out that (as far as I can tell) pretty much any ethical system you build has some sort of similar definition problem of who or what counts, and how much. (Even paperclip maximizing has a similar problem of defining what does and doesn't count as a paperclip.) I'm trying to discuss that problem, as a general feature of ethical system design for ethical systems designed around human values, without being too specific about the details of the particular ethical system in question. So if I somehow gave the impression that this was just about who gets a vote, then no, that was intended as shorthand for this larger problem of defining a set or a summation.

As for the level of rationality: for the most part, I'm discussing high-tech future societies that include not just humans but also AIs, some of them superhuman. So yes, I'm assuming more rationality than is typical for current purely-human societies. And yes, I'm also trying to apply the methods of rationality, or at least engineering design, to an area that has generally been dominated by politics, idealism, and religion. Less Wrong seemed like a reasonable place to attempt that.

Sorry, kind of bounced off the part 1 - didn't agree, but couldn't find the handle to frame my disagreement or work toward a crux.  Which makes it somewhat unfair (but still unfortunately the case) to disagree now.

I like the focus on power (to sabotage or defect) as a reason to give wider voice to the populace.  I wonder if this applies to uploads.  It seems likely that the troublemakers can just be powered down, or at least copied less often.

So which aspect(s) of part 1 didn't you agree with? (Maybe we should have a discussion there?)

I suspect your modeling of “the fairness instinct” is insufficient. Historically, there were many periods of time where slaves or mostly-powerless individuals were the significant majority. Even today, there are very limited questions where one-person-one-vote applies. Even in the few cases where that mechanism holds, ZERO allow any human (not even any embodied human) to vote. There are always pretty restrictive criteria of membership and accident of birth that limit the eligible vote population.

As I discuss in Part 1 A Sense of Fairness, societies have been becoming distinctly more egalitarian over the last few centuries. My suggestion is that having a large oppressed class has been becoming less viable as technology improved. As social and technological complexity increases, sabotage becomes more effective. This effect is going to become more and more the case as weapons of mass destruction become available to terrorists, revolutionaries, and anyone else sufficiently upset about their lot in life. A high-tech society needs to be egalitarian, because it can't afford to have even small numbers of highly disaffected people with the technical skill to cause massive damage.

Without examples, I have trouble understanding "censorship of independent-minded people".  It's probably not formal censorship (but maybe it is - most common media disallows some words and ideas).  There's a big difference between "negative reactions to beliefs that many/most find unpleasant, even if partially true" and "negative reactions to ideas that contradict common values, with no real truth value". They're not the same motives, and not the same mechanisms for the idea-haver to refine their beliefs.  

In many groups, especially public o... (read more)

Downvoted.  This states an overgeneral concept far more forcefully than it deserves, and doesn't give enough examples to know what kind of exceptions to look for.  I'm also unsure what "censure" means specifically in this model of things - is my comment a censure?

I also dislike the framing of "conventional-minded" vs "independent-minded" as attributes of people, rather than as descriptions of topics that bring criticism.  This could be intentional, if you're arguing that the kind of censure you're talking about tends to be directed at people rather than ideas, but it's not clear if so.

Your comment is not a censure of me. I didn't feel the need to distinguish between censorship of ideas and censorship of independent-minded people, because censorship of ideas censors the independent-minded. I deliberately avoided examples for the same reason Paul Graham's What You Can't Say deliberately avoids giving any specific examples: because either my examples would be mild and weak (and therefore poor illustrations) or they'd be so shocking (to most people) they'd derail the whole conversation.

Not really an answer, but a few modeling considerations:

  • "Elites" are a difficult group to talk about - it's not uniform in capabilities or desires.  It's more specific to say "commodities traders" or "fund managers" or whatever group you are actually analyzing.
  • Traders, analysts, and other market participants have used ML for a long, long time in their work.  It's not clear what about "AI" you think is different enough to justify a large change.

At a simple calculation, $19B USD for 118M expected signatures (of different types) is $161 per signature.  This contradicts the article, which says 1.5 francs or 2-4 francs per signature.  However, it's also "2 to 17" billion francs, depending on actual usage.  Still doesn't add up.
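The arithmetic above is easy to sanity-check; here is a quick sketch using the figures quoted in this comment (the totals and signature count are taken from the discussion, not verified against the original article):

```python
# Quick sanity check of the per-signature cost figures quoted above.
total_cost_usd = 19e9        # headline figure: $19B USD
signatures = 118e6           # 118M expected signatures

per_signature_usd = total_cost_usd / signatures
print(round(per_signature_usd))  # 161 USD per signature

# The article's own total range, in francs, implies a different rate:
for total_francs in (2e9, 17e9):
    print(round(total_francs / signatures))  # 17 and 144 francs
```

Either way, none of these per-signature figures lands anywhere near the quoted 1.5-4 francs, which is the contradiction being pointed out.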

I have no clue what's actually included in the price - digitization and indexing/retrieval of documents can cost a lot more than just the identity verification.  And legally-binding identity verification ain't cheap in the first place.

It does seem high to me, but I can say that about almost all government spending, for any country for any program.

I can't tell if you're saying "this is completely and horribly incorrect in approach and model", or if you're saying "yeah, there are cases where imposed rapid change is harmful, but there's nuance I'd like to point out".  I disagree with the former, and don't see the latter very clearly in the text.

The title of Scott's post (I gave up 70 percent of the way through) seems about right to me, and skimming over the post, it seems he's mostly talking about extreme, rapid, politically-motivated changes.  I agree with him that it's concerning, and the vi... (read more)

Out of the two options, this is closer to my view: I think Scott’s model of how changes in the words we use for minority groups happen is just factually inaccurate and unrealistic. Changes are generally slow, gradual, long-lasting, and are primarily advocated for in good faith by conscientious members of the minority group in question.
Well, my examples are both real and non-fringe, whereas "Asian" and "field work" are fictional and fringe, respectively. So, I think "gay" and "Black" are more central examples. Scott also seems annoyed by "Black", but doesn't explain why he's (seemingly) annoyed. There's a bit more here than I can readily respond to right now, but let me know if you think I've avoided the crux of the matter and you'd like me to address it in a future comment.

We absolutely agree that incentives matter.  Where I think we disagree is on how much they matter and how controllable they are.  Especially for orgs whose goals are orthogonal or even contradictory with the common cultural and environmental incentives outside of the org.

I'm mostly reacting to your topic sentence

EAs are, and I thought this even before the recent Altman situation, strikingly bad at setting up good organizational incentives.

And wondering if 'strikingly bad' is relative to some EA or non-profit-driven org that does it well, or if 'strikingly bad' is just an acknowledgement that it may not be possible to do well.

Garrett Baker · 10d
By strikingly bad I mean there are easy changes EA can make to give its sponsored orgs better incentives, and it has too much confidence that the incentives in the orgs it sponsors favor doing good over doing bad, politics, not doing anything, etc. For example, nobody in Anthropic gets paid more if they follow their RSP and less if they don't. Changing this isn't sufficient for me to feel happy with Anthropic, but it's one example among many for which Anthropic could be better.

When I think of an Anthropic I feel happy with, I think of a formally defined balance-of-powers type situation with strong & public whistleblower protection and post-whistleblower reform processes, them hiring engineers loyal to that process (rather than to building AGI), and them diversifying the sources with which they trade, such that it's in none of their sources' interest to manipulate them. I also claim marginal movements toward this target are often good.

As I said in the original shortform, I also think incentives are not all or nothing. Worse incentives just mean you need more upstanding workers & leaders.

I'm confused.  NVidia (and most profit-seeking corporations) are reasonably aligned WRT incentives, because those are the incentives of the world around them.

I'm looking for examples of things like EA orgs, which have goals very different from standard capitalist structures, and how they can set up "good incentives" within this overall framework.  

If there are no such examples, your complaint about 'strikingly bad at setting up good organizational incentives" is hard to understand.  It may be more that the ENVIRONMENT in which they exist has competing incentives and orgs have no choice but to work within that.

Garrett Baker · 11d
You must misunderstand me. To what you say, I say that you don't want your org to be fighting the incentives of the environment around it. You want to set up your org in a position in the environment where the incentives within the org correlate with doing good. If the founders of Nvidia didn't want marginally better GPUs to be made, then they hired the wrong people, bought the wrong infrastructure, partnered with the wrong companies, and overall made the wrong organizational incentive structure for that job.

I would in fact be surprised if there were >1k-worker-sized orgs which consistently didn't reward their workers for doing good according to the org's values, were serving no demand present in the market, and yet were competently executing some altruistic goal.

Right now I feel like I'm just saying a bunch of obvious things which you should definitely agree with, yet you believe we have a disagreement. I do not understand what you think I'm saying. Maybe you could try restating what I originally said in your own words?

Can you give some examples of organizations larger than a few dozen people, needing significant resources, with goals not aligned with wealth and power, which have good organizational incentives?  

I don't disagree that incentives matter, but I don't see that there's any way to radically change incentives without pretty structural changes across large swaths of society.

Garrett Baker · 11d
Nvidia, for example, has 26k employees, all incentivized to produce & sell marginally better GPUs, and possibly to sabotage others' abilities to make and sell marginally better GPUs. They're likely incentivized to do other things as well, like play politics, or spin off irrelevant side-projects. But for the most part I claim they end up contributing to producing marginally better GPUs.

You may complain that each individual in Nvidia is likely mostly chasing base desires, and so is actually aligned with wealth & power, and it just so happens that in the situation they're in, the best way of doing that is to make marginally better GPUs. But this is just my point! What you want is to position your company, culture, infrastructure, and friends such that the way for individuals to achieve wealth and power is to do good on your company's goal. I claim it's in nobody's interest & ability in or around Nvidia to make it produce marginally worse GPUs, or to sabotage the company so that it instead goes all in on the TV business rather than the marginally-better-GPUs business.

Edit: Look at most any large company achieving consistent outcomes, and I claim it's in everyone in that company's interest or ability to help that company achieve those consistent outcomes.

A few aspects of my model of university education (in the US):

  • "Education" isn't a monolithic thing, it's a relation between student, environment, teachers, and body of material for the common conception of that degree.  Particularly good (or bad) professors can make a big difference in motivation and access to information, and can set up systems and TAs well or poorly to make it easier or harder for the median student.  That matters, but variance in student overwhelms variance in teaching ability.
  • "Top" universities are generally more focused on r
... (read more)

I mean, testing with a production account is not generally best practice, but it seems to show things are operational.  What aspect of things are you testing?

I (a real human, not a test system) saw the post, upvoted but disagreed, and made this reply comment. 

Said Achmiz · 11d
My ability to post comments!

I think "R&D" is a misleading category - it comprises a LOT of activities with different uncertainty, type, scope, and timeframe of impact.  For tax and reporting purposes, a whole lot of not-very-research-ey software and other engineering is classified as "R&D", though it's more reasonably thought of as "implementation and construction".

Nordquist's "Innovation" measure is very different from economic reporting of R&D spending.  This makes the denominator very questionable in your thesis.

Perhaps more important, returns are NOT uniform... (read more)

I'm not sure the connection between martial arts training/competition and rationalist discussion is all that strong.  Also, I'm not sure if this is meant to apply to "casual discussion in most contexts" or "discussion about rationalist topics among people who share a LOT of context and norms", or "comment threads on LessWrong".

The primary difference I see is that in martial arts, the goal is generally self-improvement, where in rationalist discussions the goal is finding and agreeing on external truths.  Martial arts isn't about disagreement or m... (read more)

I'd love it if tapping out as a safe, no-shame-attached way of leaving a discussion became normal outside of rationalist circles. See footnote 2 as an example. Call that a stretch goal. Primarily, I'm trying to nudge the connotations and etiquette around how rationalists use the concept.

I notice I am confused about how you're thinking of the goal of martial arts. "Self-improvement" isn't wrong, but the thing I wanted from it was to go from "get punched" -> "flail ineffectually" to "get punched" -> "block, hit back, leave." While I was physically in the dojo, yes, I was trying to improve my capabilities, but there was a less abstract goal in mind.

Sometimes in a discussion with rationalists, I'm trying to figure out the answer to a specific question I have about the world whose answer matters to me. Other times I think the other person is wrong and they think I'm wrong, and we're trying to figure out what's true because it would change how we act. Often we're mostly just talking because conversation is fun, and then it gets less fun because somebody isn't letting another person gracefully exit or topic-switch.

I don't think you should have to use the exact phrase "tapping out." Use what works or has the implications you prefer!

Agreed with the main point of your comment: even mildly-rare events can be distributed in such a way that some of us literally never experience them, and others of us see them so often they appear near-universal.  This is both a true variance in distribution AND a filter effect of what gets highlighted and what gets downplayed in different social groups.

For myself, in Seattle (San-Francisco-Lite), I'd only very rarely noticed that someone was trans until the early '00s, when a friend transiti... (read more)

In addition to measurement problems, and definitional problems (is p-hacking "fraud" or just bad methodology?), I think "academia" is too broad to meaningfully answer this question.

Different disciplines, and even different topics within a discipline will have a very different distribution of quality of research, including multiple components - specificity of topic, design of mechanism, data collection, and application of testing methodology.  AND in clarity and transparency, for whether others can easily replicate the results, AND agree or disagree wi... (read more)

Thanks for this - it's an important part of modeling the world and understanding the competitive and cooperative symbiosis of commerce (and generally, human interaction).

I think application of this model requires extending the idea of "monopoly" to include partial substitutability (most non-government-supported monopolies aren't all or nothing, they're hard-to-quantify-but-generally-small differences in desirability). And also some amount of human herding and status-quo bias that makes a temporary advantage much more long-lived if you can make it habitual or accepted standard.  

I mean, there are some parallels between any two topics.  Whether those parallels are important, and whether they help model either thing varies pretty widely.

In this case, I don't see many useful parallels.  The difference between the individual, small-scale rights and demonstrably real power to harm a very few individuals in the case of guns, versus the somewhat theoretical future large-scale degradation or destruction of civilization, makes it just a completely different dimension of disagreement.

One parallel MIGHT be the general distrust of government restriction on private activity, but from people I've talked with on both topics, that's present but not controlling for beliefs about these topics.

Upvoted for interesting ideas and personal experience on the topic.  If I could strong-disagree, I would.  I do not recommend this to anyone.

Mostly my reasoning is "not safe".  You're correct that historically, the IRS doesn't come at small non-payers very hard.  You're incorrect to extend that to "never" or to "that won't change without warning due to technology, or legal/political environment".  You're also correct that, at current interest rates, it's about double at ten years.  You're incorrect, though, to think that's the... (read more)

David Gross · 18d
I believe it's not actually true that, if you merely repeatedly neglect to pay your taxes, the I.R.S. will inquire into your motives and intent in order to decide whether to come after you with both barrels blazing. As far as I can tell they do not have the resources or inclination to do that sort of investigation. I base this largely on the experience of American war tax resisters. They are often loudly self-incriminating about their willful intent: sometimes going so far as to write letters to the I.R.S. explaining their motivation. Of the tens of thousands of Americans who have engaged in war tax resistance over the years, I know of only two in the past 80 years who have been criminally prosecuted merely for willful refusal to pay taxes (there have been others who have been criminally prosecuted or jailed for things like filing inaccurate forms or contempt of court, but those were cases in which they were defying the law in ways that went beyond merely not paying). The war tax resistance movement keeps pretty good records on its "martyrs" so if there were other cases like those two they would probably have come to my attention.

It gets tried every so often, but there are HUGE differences between companies and geographical/political governance.   

The primary difference, in my mind, is filtering and voluntary association.  People choose where to work, and companies choose who works for them, independently (mostly) of where they live, what kind of lifestyle they like, whether they have children or relatives nearby, etc.  Cities and countries can sometimes turn away some immigrants, but they universally accept children born there and they can't fire citizens who aren't productive.

Umm, I think you're putting too much weight on idiomatic shorthand that's evolved for communicating some common things very easily, and less-common ideas less easily.  "Garfield is a cat" is a very reasonable and common thing to try to communicate - a specific, not-well-known thing (Garfield) being described in terms of nearly-universal knowledge ("cat").  The reverse might be "Cats are things like Garfield", which is a bit odd because the necessity of communicating it is a bit odd.

It tends to track specific to general, not because they're specific or general concepts, but because specifics more commonly need to be described than generalities.  


Bill Benzon · 23d
I don't think there's anything particularly idiomatic about "is a," but that's a side issue. What's at issue are the underlying linguistic mechanisms. How do they work? Sure, some communicative tasks may be more common than others, and that is something to take into account. Linguistic mechanisms that are used frequently tend to be more compact than those used less frequently, for obvious reasons. Regardless of frequency, how do they work?

If you think evolution has a utility function, and that it's the SAME function that an agent formed by an evolutionary process has, you're not likely to get me to follow you down any experimental or reasoning path.  And if you think this utility function is "perfectly selfish", you've got EVEN MORE work cut out in defining terms, because those just don't mean what I think you want them to.

Empathy as a heuristic to enable cooperation is easy to understand, but when normatively modeling things, you have to deconstruct the heuristics to actual goals and strategies.

Take a step back and try rereading what I wrote in a charitable light, because it appears you have completely misconstrued what I was saying. A major part of the "cooperation" involved here is being able to cooperate with yourself.

In an environment with a well-mixed group of bots each employing differing strategies, and some kind of reproductive rule (if you have 100 utility, say, spawn a copy of yourself), Cooperate-bots are unlikely to be terribly prolific; they lose out against many other bots. In such an environment, a stratagem of defecting against bots that defect against Cooperate-bot is a cheap mechanism of coordination: you can coordinate with other "Selfish Altruist" bots, and cooperate with them, but you don't take a whole lot of hits from failing to defect against Cooperate-bot. Additionally, you're unlikely to run up against very many bots that cooperate with Cooperate-bot but defect against you. As a coordination strategy, it is therefore inexpensive.

And if "computation time" is considered as an expense against utility, which I think reasonably should be the case, you're doing a relatively good job minimizing this; you have to perform exactly one prediction of what another bot will do. I did mention this was a factor.
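The strategy described above is compact enough to sketch directly. This is a hypothetical toy model (the payoff matrix and all function names are my own assumptions, not anything specified in the comment):

```python
# Toy model of the "Selfish Altruist" strategy: defect against bots that
# would defect against Cooperate-bot, cooperate with everyone else.

def cooperate_bot(opponent):
    return "C"  # always cooperates

def defect_bot(opponent):
    return "D"  # always defects

def selfish_altruist(opponent):
    # Exactly one prediction per opponent: simulate what the opponent
    # would do when facing Cooperate-bot, and mirror that.
    return "C" if opponent(cooperate_bot) == "C" else "D"

# Standard prisoner's-dilemma payoffs (assumed values).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def play(a, b):
    return PAYOFF[(a(b), b(a))]

print(play(selfish_altruist, selfish_altruist))  # (3, 3): coordinates with copies of itself
print(play(selfish_altruist, cooperate_bot))     # (3, 3): forgoes exploiting Cooperate-bot (a small hit vs. 5)
print(play(selfish_altruist, defect_bot))        # (1, 1): punishes defectors
```

Note that the prediction is always made against Cooperate-bot, never against the opponent directly, so two Selfish Altruists facing each other terminate immediately rather than recursing; that single cheap prediction is the "computation time" point above.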

I think you're using the wrong model for what "have a purpose" means.  Purpose isn't an attribute of a thing; it's a relation between an agent and a thing.  An agent infers (or creates) a purpose for things (including themselves).  This purpose-for-me is temporary, mutable, and relative.  Different agents may have different (or no) purposes for the same thing.

[epistemic status: mostly priors about fantastic quantities being bullshit.  no clue what evidence would update me in any direction. ]

I don't believe the universe is infinite.  It has a beginning, an end, and a finite (but large and perhaps growing) extent.  I further do not believe the term "exist" can apply to other universes.  

I see.  So the experiment is to see if you can find a frequency that is comfortable/helpful, and then figure out if it's likely to match your alpha waves?  From what I can tell, alpha waves are typically between 8 and 12 Hz, but I don't know if it varies over time (nor how quickly) for individuals.  

Unfortunately, the linked paper notes that the pulse is timed with the "trough" of the alpha wave, which is unlikely to be found with at-home experimentation.  That implies that it'd need to use an EEG to synchronize, rather than ANY fixed frequency.

I think maybe you can stare for a while until it syncs up, but I don't have an EEG. I also tried restarting the strobe a bunch to get on a different phase, but didn't notice any difference.

Do you have a hypothesis you're collecting data for, or is this just fun for you?  I'm a little put off by the imperative in the title, without justification in the post.

Yes, the hypothesis is that if you flash a light in sync with your alpha brain waves, then focusing and learning are easier.

For some screen size/shape, for some browser positioning, for some readers, this is probably true. It's fucking stupid to believe that's anywhere close to a majority.   If that's YOUR reading area, why not just make your browser that size? 

It should be pretty easy to write a tampermonkey or browser extension to make it work that way.  Now that you point it out, I'm kind of surprised this doesn't seem to exist.  

I admit that 30-50% is arbitrary and shouldn't be brought up like a fact; I have removed it. (I didn't mean to have such a strong tone there, but I did.) What I really want to say is that the default location for the target text should be somewhere closer to the middle, or wherever most people usually rest their eyes. (Perhaps exactly the height where you clicked the in-page redirect?) I still stand by the claim that it should not be exactly at the top, for ease of reading (I hope this doesn't sound too motte-and-bailey). The reason it is redirected to the top is probably that the top is a very objective location and isn't affected by device size. But it is very much not a standard location for the line of text you are currently reading. I am willing to bet that <3% of people read articles by scrolling their currently-reading line up into the top three visible lines.

The VAST majority of matter and energy in the universe is in the non-purpose category - it often has activity and reaction, and effects over time, but it doesn't strategically change its mechanisms in order to achieve something; it just executes.

Humans (and arguably other animals and groups distinct from individuals) may have purpose, and may infer purpose on things that don't have it intrinsically.  Even then, there are usually multiple simultaneous purposes (and non-purpose mechanisms) that interact, sometimes amplifying, sometimes dampening one another.

Andreas Chrysopoulos:
Locally maybe there is no purpose. But maybe it's necessary for life to emerge elsewhere, so it could have a larger purpose. If you isolate a napkin, it has no purpose, but as soon as you need to wipe your mouth it acquires one. So maybe purpose is relative. But yeah, looking at my original post, I'm trying to compare the purpose of the universe with the purpose of humans, which doesn't necessarily overlap.

I think you're using a different sense of the word "possible".  In a simplified physics model, where mass and energy are easily transformed as needed, you can just wave your hands and say "there's plenty of mass to use for computronium".  That's not the same as saying "there is an achievable causal path from what we experience now to the world described".

It's also assuming:

  1. We know roughly how to achieve immortality
  2. We can do that exactly in the window of "the last possible moment" of AGI.
  3. Efforts between immortality and AGI are fungible and exclusive, or at least related in some way.
  4. Ok, yeah - we have to succeed on BOTH alignment and immortality to keep any of us from dying. 

3 and 4 are, I think, the point of the post.  To the extent that we work on immortality rather than alignment, we narrow the window of #2, and risk getting neither.

Isn't the assumption that once we successfully align AGI, it can do the work on immortality? So "we" don't need to know how beyond that.

Honestly, I haven’t seen much about individual biological immortality, or even significant life-extension, in the last few years.

I suspect progress on computational consciousness-like mechanisms has fully eclipsed the idea that biological brains in the current iteration are the way of the future. And there’s been roughly no progress on upload, so the topic of immortality for currently-existing humans has mostly fallen away.

Also, if/when AI is vastly more effective than biological intelligence, it takes a lot of the ego-drive away for the losers.

Note that in adversarial (or potentially adversarial) situations, error is not independent and identically distributed.  If your acceptance spec for gold coins is "25g +/- 0.5g", you should expect your suppliers to mostly give you coins near 24.5g.  Network errors are also correlated, either because they ARE an attack, or because some specific component or configuration is causing it.  
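A toy simulation of the gold-coin example (distributions assumed purely for illustration): honest noise centers on the target weight, while an adversarial supplier piles up just inside the acceptance floor.

```python
import random

def honest_coin(rng: random.Random) -> float:
    # Honest minting: symmetric measurement noise around the 25 g target.
    return rng.gauss(25.0, 0.2)

def shaved_coin(rng: random.Random) -> float:
    # Adversarial supplier: as light as the 24.5 g floor allows,
    # plus a small safety margin so coins still pass inspection.
    return 24.5 + abs(rng.gauss(0.0, 0.05))

rng = random.Random(0)
honest = [honest_coin(rng) for _ in range(1000)]
shaved = [shaved_coin(rng) for _ in range(1000)]
# The shaved batch clears the spec, but its errors are all in one
# direction - systematically hugging the bottom edge of tolerance.
```

The point of the sketch: averaging over the shaved batch doesn't cancel the error the way iid noise would, because the "error" is optimized against your acceptance rule.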

Hmm.  I've not seen any research about that possibility, which is obvious enough that I'd expect to see it if it were actually promising.  And naively, it's not clear that you'd get more powerful results from using 1M times the compute this way, compared to more direct scaling.

I'd put that in the exact same bucket as "not known if it's even possible".

Such a possibility is explored at least here: but that's not the point. The point is: even in a hypothetical world where scaling laws and algorithmic progress hit a wall at smartest-human level, you can do this and get an arbitrary level of intelligence. In the real world, of course, there are better ways.

An important sub-topic within "open source vs regulatory capture" is "there does not exist an authority that can legibly and correctly regulate AI".  

Note that the outlook from MIRI folks appears to somewhat agree with this, that there does not exist an authority that can legibly and correctly regulate AI, except by stopping it entirely.

I always like seeing interesting ideas, but this one doesn't resonate much for me.  I have two concerns:

  1. Does it actually make the site better?  Can you point out a few posts that would be promoted under this scheme, but the mods didn't actually promote without it?  My naive belief is that the mods are pretty good at picking what to promote, and if they miss one all it would take is an IM to get them to consider it.
  2. Does it improve things to add money to the curation process (or to turn karma into currency which can be spent)?  My current belief is that it does not - it just makes things game-able.

I think mods promote posts quite rarely, other than via the mechanism of deciding whether posts should be frontpage or not, right? Fair enough on "just IM-ing the mods is enough". I'm not sure what I think about this. Your concerns seem reasonable to me. I probably won't bother trying to find examples where I'm not biased, but I think there are some.

I ... don't think that line of thinking almost ever applies to me.  If the topic interests me and/or there's something about the post that piques my desire to discuss, it almost always turns out that there are others with similar willingness.  At the very least, the OP usually engages to some extent.

There are very few, and perhaps zero, cases where crafting or even evaluating an existing contract is less effort than just reading and responding AND I see enough potential to expend the contract effort but not the read/reply effort. 

In addition... (read more)

A lot depends on whether this is a high-bandwidth discussion/debate, or an anonymous post/read of public statements (or, on messages boards, somewhere in between).  In the interactive case, Alice and Bob could focus on cruxes and specific points of agreement/disagreement.  In the public/semi-public case, it's rare that either side puts that much effort in.

I'll also note that a lot of topics on which such disagreements persist are massively multidimensional and hard to quantify degree of closeness, so "agreement" is very hard to define.  No t... (read more)

I'm talking specifically about discussions on LW. Of course in reality Alice ignores Bob's comment 90% of the time, and that's a problem in its own right. It would be ideal if people who have distinct information would choose to exchange that information. I picked a specific and reasonably grounded topic, "x-risk", or "the probability that we all die in the next 10 years", which is one number, so not hard to compare, unless you want to break it down by cause of death. In contrived philosophical discussions, it can certainly be hard to determine who agrees on what, but I have a hunch that this is the least of the problems in those discussions. A lot of things have zero practical impact, and that's also a problem in its own right. It seems to me that we're barely ever having "is working on this problem going to have practical impact?" types of discussions.
Answer by Dagon, Oct 25, 2023

I mean, the universal dispute resolution is violence, or the threat thereof.  Typically this is encapsulated in governments, courts, and authorities, in order to make an escalation path that rarely comes down to actual violence. 

For low-value wagers/markets, a less powerful authority generally suffices - a company or even individual running the market/site.  The predictions can be written such that it's unlikely to be disputed, and to specify a dispute-resolution mechanism, but in the end the enforcement is by whoever is holding the money. ... (read more)

Yup, like so many thought experiments, it's intended to restrict all the real-world options in order to focus on the intuition conflict between "once" and "commonly".  One of the reasons I'm not a Utilitarian is that I don't think most values are anywhere near linear, and simple scaling (shut up and multiply) just doesn't resonate with me.

If the "hero for hire" is a lifeguard or swimming instructor, we have LOTS of examples of communities or occasionally rich individuals deciding to provide that.  The difference that the thought experiment fails to make clear is one of timeframe and (as you point out) uniqueness of YOUR ability to help.

Upvoted, and thanks for writing this.  I disagree on multiple dimensions - on the object level, I don't think ANY research topic can be stopped for very long, and I don't think AI specifically gets much safer with any achievable finite pause, compared to a slowdown and standard of care for roughly the same duration.  On the strategy level, I wonder what other topics you'd use as support for your thesis (if you feel extreme measures are correct, advocate for them).  US Gun Control?  Drug legalization or enforcement?  Private capital... (read more)
