Good arguments - notes on Craft of Research chapter 7
Arguments take place in 5 parts.
This can be modeled as a conversation with readers, where the reader prompts the writer to take the next step on the list.
Claims ought to be supported with reasons. Reasons ought to be based on evidence. Arguments are recursive: one part of an argument acknowledges an anticipated response, and a further argument addresses that response. Finally, when the distance between a claim and a reason grows large, we draw connections with something called warrants.
The logic of warrants proceeds in generalities and instances. A general circumstance predictably leads to a general consequence, and if you have an instance of the circumstance you can infer an instance of the consequence.
Arguing in real-life papers is more complex than the five steps, because:
Claims should be supported by two or more reasons
A writer can anticipate and address numerous responses.
As I mentioned, arguments are recursive, especi
Getting the easy things right shows respect for your readers and is the best training for dealing with the hard things.
If they don't believe the evidence, they'll reject the reasons and, with them, your claim.
thoughts on chapter 9 of Craft of Research
We saw previously that claims ought to be supported with reasons, and reasons ought to be based on evidence. Now we will look closer at reasons and evidence.
Reasons must be in a clear, logical order. Atomically, readers need to buy each of your reasons, but compositionally they need to buy your logic. Storyboarding is a useful technique for arranging reasons into a logical order: physical arrangements of index cards, or some DAG-like syntax. Here, you can list evidence you have for each reason or, if you're speculating, list the kind of evidence you would need.
When storyboarding, you want to read out the top level reasons as a composite entity without looking at the details (evidence), because you want to make sure the high-level logic makes sense.
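For concreteness, here's a minimal sketch of the DAG-like storyboard idea (Python; the claim, reasons, and evidence labels are invented for illustration): each reason points at the evidence you have, or the evidence you'd need, and the top-level logic can be read off without descending into the leaves.

```python
# Hypothetical storyboard: claim -> reasons -> evidence (or evidence you'd need).
storyboard = {
    "claim": "Remote teams ship faster",
    "reasons": [
        {
            "reason": "Fewer interruptions per engineer",
            "evidence": ["2023 internal survey", "TODO: need time-tracking data"],
        },
        {
            "reason": "A larger hiring pool means roles are filled sooner",
            "evidence": ["recruiting funnel stats"],
        },
    ],
}

# Read out only the top-level logic, ignoring the evidence leaves.
print(storyboard["claim"])
for r in storyboard["reasons"]:
    print("  because:", r["reason"])
```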
Readers will not accept a reason until they see it anchored in what they consider to be a bedrock of established fact. ... To count as evidence, a statement must report something tha
Sources - notes on Craft of Research chapters 5 and 6
Primary, secondary, and tertiary sources
Primary sources provide you with the "raw data" or evidence you will use to develop, test, and ultimately justify your hypothesis or claim.
Secondary sources are books, articles, or reports that are based on primary sources and are intended for scholarly or professional audiences.
Tertiary sources are books and articles that synthesize and report on secondary sources for general readers, such as textbooks, articles in encyclopedias, and articles in mass-circulation publications.
The distinction between primary and secondary sources comes from 19th century historians, and the idea of tertiary sources came later. The boundaries can be fuzzy, and are certainly dependent on the task at hand.
I want to reason about what these distinctions look like in the alignment community, and whether or not they're important.
The rest of chapter five is about how to use libraries and information technologies, and how to evaluate sources for relevance and reliability.
Chapter 6 starts off with the kind of thing you should be looking for while you read:
Look for creative agreement
Yesterday I quit my job for direct work on epistemic public goods! Day one of the direct-work trial offer is April 4th, and it'll take 6 weeks after that to know if I'm a full-time hire.
I'm turning down:
a raise to $200k/yr USD
building lots of skills and career capital that would give me immense job security in worlds where investment into one particular blockchain doesn't go entirely to zero
having fun on the technical challenges
for:
the confluence of my skillset and a theory of change that could pay huge dividends in the epistemic public goods space
nonprosaic ai will not be on short timelines
I think a property of my theory of change is that academic and commercial speed is a bottleneck. I recently realized that my mass assignment for timelines synchronized with my mass assignment for the prosaic/nonprosaic axis. The basic idea: say a radical new paper that blows up and supplants the entire optimization literature gets pushed to the arxiv tomorrow, signaling the start of some paradigm that we would call nonprosaic. The lag time for academics and industry to figure out what's going on, fi... (read more)
The reasoning assumes that ideas are first generated in academia and don't arise inside of companies. With DeepMind outperforming the academic protein folding community when protein folding isn't even the main focus of DeepMind, I consider it plausible that new approaches arise within a company and only get released publicly when they are strong enough to have an effect.
Even if there's a paper, most radical new papers get ignored by most people, and it might be that in the beginning only one company takes the idea seriously and doesn't talk about it publicly to keep a competitive edge.
Quinn (2y):
That's totally fair, but I have a wild guess that the pipeline from google brain
to google products is pretty nontrivial to traverse, and not wholly unlike the
pipeline from arxiv to product.
Steven Byrnes (2y):
How short is "short" for you?
Like, AlexNet was 2012, DeepMind patented deep Q learning in 2014, the first
TensorFlow release was 2015, the first PyTorch release was 2016, the first TPU
was 2016, and by 2019 we had billion-parameter GPT-2 …
So if you say "Short is ≤2 years", then yeah, I agree. If you say "Short is ≤8
years", I think I'd disagree, I think 8 years might be plenty for a non-prosaic
approach. (I think there are a lot of people for whom AGI in 15-20 years still
counts as "short timelines". Depends on who you're talking to, I guess.)
Quinn (2y):
I should've mentioned in OP but I was lowkey thinking upper bound on "short"
would be 10 years.
I think developer ecosystems are incredibly slow (longer than ten years for a
new PL to gain penetration, for instance). I guess under a singleton "one
company drives TAI on its own" scenario this doesn't matter, because tooling tailored for a few teams internal to the same company is enough, and that can move faster than a proper developer ecosystem. But under a CAIS-like scenario there
would need to be a mature developer ecosystem, so that there could be
competition.
I feel like 7 years from AlexNet to the world of PyTorch, TPUs, tons of ML MOOCs, billion-parameter models, etc. is strong evidence against what you're saying, right? Or were deep neural nets already a big and hot and active ecosystem even before AlexNet, more than I realize? (I wasn't paying attention at the time.)
Moreover, even if not all the infrastructure of deep neural nets transfers to a new family of ML algorithms, much of it will. For example, the building up of people and money in ML, the building up of GPU / ASIC servers and the tools to use them, the normalization of the idea that it’s reasonable to invest millions of dollars to train one model and to fab ASICs tailored to a particular ML algorithm, the proliferation of expertise related to parallelization and hardware-acceleration, etc. So if it took 7 years from AlexNet to smooth turnkey industrial-scale deep neural nets and billion-parameter models and zillions of people trained to use them, then I think we can guess <7 years to get from a different family of learning algorithms to the analogous situation. Right? Or where do you disagree?
No you're right. I think I'm updating toward thinking there's a region of
nonprosaic short-timelines universes. Overall it still seems like that region is
relatively much smaller than prosaic short-timelines and nonprosaic
long-timelines, though.
Excellence and adequacy
I asked a friend whether I should TA for a codeschool called ${{codeschool}}.
You shouldn't hang around ${{codeschool}}. People at ${{codeschool}} are not pursuing excellence.
There's a hidden claim that I would soak up the pursuit of non-excellence by proximity or osmosis, but that isn't what's interesting (though I could see it turning out either way). What's interesting is the value of non-excellence, which I'll call adequacy.
${{codeschool}} in this case is effective and impactful at putting butts in seats at companies, and is thereby re... (read more)
Seems to me that on the market there are very few jobs for the SICP types.
The more meta something is, the less of that is needed. If you can design an
interactive website, there are thousands of job opportunities for you, because
thousands of companies want an interactive website, and somehow they are willing
to pay for reinventing the wheel. If you can design a new programming language
and write a compiler for it... well, it seems that world already has too many
different programming languages, but sure there is a place for maybe a dozen
more. The probability of success is very small even if you are a genius.
The best opportunity for developers who think too meta is probably to design a
new library for an already popular programming language, and hope it becomes
popular. The question is how exactly you plan to get paid for that.
Probably another problem is that it requires intelligence to recognize
intelligence, and it requires expertise to recognize expertise. The SICP type
developer seems to most potential employers and most potential colleagues as...
just another developer. The company does not see individual output, only team
output; it does not matter that your part of code does not contain bugs, if the
project as a whole does. You cannot use solutions that are too abstract for your
colleagues, or for your managers. Companies value replaceability, because it is
less fragile and helps to keep developer salaries lower than they might be
otherwise. (In theory, you could have a team full of SICP type developers, which
would allow them to work smarter, and yet the company would feel safe. In
practice, companies can't recognize this type and don't appreciate it, so this
is not going to happen.)
Again, probably the best position for a SICP type developer in a company would
be to develop some library that the rest of the company would use. That is, a
subproject of a limited size that the developer can do alone, so they are not
limited in the techniques they use,
Let FairBot be the player that sends an opponent to Cooperate (C) if it is provable that they cooperate with FairBot, and sends them to Defect (D) otherwise.
Let FairBot_k be the player that searches for proofs of length <= k that its input cooperates with FairBot_k, and cooperates if it finds one, returning Defect if all the proofs of length <= k are exhausted without one being valid.
Critch writes that "100%" of the time, mathematicians and computer scientists report believing that FairBot_k(FairBot_k) = D, owing to the basic vision of a stack overf... (read more)
It is almost certainly true that setting k=1, Fairbot_1 defects against
Fairbot_1 because there are no proofs of cooperation that are 1 bit in length.
There can be exceptions: for instance, where Fairbot_1(Fairbot_1) = C is
actually an axiom, and represented with a 1-bit string.
It is definitely not true that Fairbot_k cooperates with Fairbot_k for all k and
all implementations of Fairbot_k, with or without Löb's theorem. It is also
definitely not true that Fairbot_k defects against Fairbot_k in general. Whether
they cooperate or defect depends upon exactly what proof system and encoding
they are using.
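A minimal structural sketch of the bounded FairBot described above (Python; the proof search is deliberately a stub, since, as the comment above notes, whether FairBot_k cooperates with itself hinges entirely on the proof system and encoding behind that search, not on the wrapper logic):

```python
def provable_within(k: int, statement: str) -> bool:
    """Stub for a bounded proof search: return True iff some proof of `statement`
    of length <= k exists in the chosen proof system. A real implementation would
    enumerate proofs; this stub finds nothing, which is what you'd expect for very
    small k in most encodings."""
    return False

def fairbot_k(k: int, opponent_source: str) -> str:
    """Cooperate iff a bounded proof shows the opponent cooperates with us."""
    statement = f"({opponent_source}) cooperates with FairBot_{k}"
    return "C" if provable_within(k, statement) else "D"

# With an empty proof search, FairBot_k defects against everything,
# including (a quined description of) itself.
print(fairbot_k(1, "fairbot_1"))  # -> D
```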
Jalex Stark (2mo):
I think that to get the type of the agent, you need to apply a fixpoint
operator. This also happens inside the proof of Löb for constructing a certain
self-referential sentence.
(As a breadcrumb, I've heard that this is related to the Y combinator.)
I find myself, just as a random guy, deeply impressed at the operational competence of airports and hospitals. Any good books about that sort of thing?
It is pretty impressive that they function as well as they do, but seeing how
the sausage is made (at least in hospitals) does detract from it quite
substantially. You get to see not only how an enormous number of battle-hardened processes prevent a lot of lethal screw-ups, but also how sometimes the very same processes cause serious and very occasionally lethal screw-ups.
It doesn't help that hospitals seem to be universally run with about 90% of the
resources they need to function reasonably effectively. This is possibly because
there is relentless pressure to cut costs, but if you strip any more out of them
then people start to die from obviously preventable failures. So it stabilizes
at a point where everything is much more horrible than it could be, but not
quite to an obviously lethal extent.
As far as your direct question goes, I don't have any good books to recommend.
Rats and EAs should help with the sanity levels in other communities
Consider politics. You should take your political preferences/aesthetics, go to the tribes that are based on them, and help them be more sane. In the politics example, everyone's favorite tribe has failure modes, and it is sort of the responsibility of the clearest-headed members of that tribe to make sure that those failure modes don't become the dominant force of that tribe.
Speaking for myself, having been deeply in an activist tribe before I was a rat/EA, I regret I wasn't there to hel... (read more)
But what if that makes my tribe lose the political battle?
I mean, if rationality actually helped win political fights, by the power of evolution we would all already have been born rational...
Pattern (1y):
1. Evolution does not magically get from A to B instantly.
2. Evolution does not necessarily care about X for many values of X.
This can include: winning political fights, whether or not nukes are built and
many other things.
This may be context-dependent. Different countries probably have different
cultural norms. Norms may differ for higher-status and lower-status speakers.
Humble speech may impress some people, but others may perceive it as a sign of
weakness. Also, is your audience fellow scientists or are you writing a popular
science book? (More hedging for the former, less hedging for the latter.)
notes (from a very jr researcher) on alignment training pipeline
Training for alignment research is one part competence (at math, cs, philosophy) and another part having an inside view / gears-level model of the actual problem. Competence can be outsourced to universities and independent study, but inside view / gears-level model of the actual problem requires community support.
A background assumption I'm working with is that training as a longtermist is not always synchronized with legible-to-academia training. It might be the case that jr researchers oug... (read more)
I don't think Critch's saying that the best way to get his attention is through
cold emails backed up by credentials. The whole post is about him not using that as a filter to decide who's worth his time; rather, people should create good technical writing to get attention.
philip_b (2y):
Critch's written somewhere that if you can get into UC Berkeley, he'll
automatically allow you to become his student, because getting into UC Berkeley
is a good enough filter.
ChristianKl (2y):
Where did he say that? Given that he's working at UC Berkeley I would expect him
to treat UC Berkeley students preferentially for reasons that aren't just about
UC Berkeley being able to filter.
It's natural that you can sign up for one of the classes he teaches at UC
Berkeley by being a student of UC Berkeley.
Being enrolled into MIT might be just as hard as being enrolled into UC Berkeley
but it doesn't give you the same access to courses taught at UC Berkeley by its
faculty.
philip_b (2y):
http://acritch.com/ai-berkeley/
and also
ChristianKl (2y):
Okay, he does speak about using Berkeley as a filter but he doesn't speak about
taking people as his student.
It seems to be about helping people in UC Berkeley connect with other people in UC Berkeley.
Methods, famously, includes the line "I am a descendant of the line of Bacon", tracing empiricism to either Roger (13th century) or Francis (16th century) (unclear which).
Though a cursory wikiing shows an 11th-century figure providing precedents for empiricism! Alhazen, or Ibn al-Haytham, worked mostly on optics apparently, but had some meta-level writings about the scientific method itself. I found this shockingly excellent quote:
The duty of the man who investigates the writings of scientists, if learning the truth is his goal, is to make himself an enemy of a
New discord server dedicated to multi-multi delegation research
DM me for invite if you're at all interested in multipolar scenarios, cooperative AI, ARCHES, social applications & governance, computational social choice, heterogeneous takeoff, etc.
(side note I'm also working on figuring out what unipolar worlds and/or homogeneous takeoff worlds imply for MMD research).
Questions and Problems - thoughts on chapter 4 of Craft of Doing Research
Last time we discussed the difference between information and a question or a problem, and I suggested that the novelty-satisfied mode of information presentation isn't as good as addressing actual questions or problems. In chapter 3, which I have not typed up thoughts about, a three-step procedure is introduced:
Topic: "I am studying ..."
Question: "... because I want to find out what/why/how ..."
Significance: "... to help my reader understand ..."
As we elaborate on the different k
preorders as the barest vocabulary for emergence
We can say "a monotonic map Φ ∈ mono(Q^P) is a phenomenon of P as observed by Q"; then emergence is simply the failure to preserve joins.
Given preorders (P, ≤_P) and (Q, ≤_Q), we say a map Φ ∈ mono(Q^P) "preserves" joins (which, recall, are least upper bounds) iff ∀ a, b ∈ P, Φ(a) ∨_Q Φ(b) = Φ(a ∨_P b), where by "x = y" we mean x ≤ y ∧ y ≤ x.
Suppose Φ is a measurement taken from a particle. We would like for our measurement system to be robust against emergence, which is literally operationalized by measuring one particle, measuring another, t... (read more)
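A concrete toy instance of the definition (Python; the particular preorders and map are my own illustration, not from the post): take P to be subsets of {1, 2} ordered by inclusion, Q the natural numbers ordered by ≤, and Φ = cardinality. Joins in P are unions, joins in Q are maxima, and Φ fails to preserve them: the measured "whole" is bigger than the join of the measured parts, which is the flavor of emergence the definition points at.

```python
from itertools import combinations

# P: subsets of {1, 2} ordered by inclusion; join = union.
# Q: natural numbers ordered by <=;          join = max.
# Phi: cardinality, a monotone map from P to Q (illustrative choice).
P = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]
phi = len

def join_P(a, b):
    return a | b

def join_Q(x, y):
    return max(x, y)

# Join preservation would require phi(a) v_Q phi(b) == phi(a v_P b) for all a, b.
violations = [(set(a), set(b)) for a, b in combinations(P, 2)
              if join_Q(phi(a), phi(b)) != phi(join_P(a, b))]
print(violations)  # [({1}, {2})]: max(1, 1) = 1 but |{1} ∪ {2}| = 2, so this join is not preserved
```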
Ambition, romance, kids
Two premises of mine are that I'm more ambitious than nearly everyone I meet in meatspace, and normal distributions. This implies that in any relationship, I should expect to be the more ambitious one.
I do aspire to be a nagging voice increasing the ambitions of all my friends. I literally break the ice with acquaintances by asking "how's your master plan going?" because I try to create vibes like we're having coffee in the hallway of a supervillain conference, and I like to also ask "what harder project is your current project a war... (read more)
Positive and negative longtermism
I'm not aware of a literature or a dialogue on what I think is a very crucial divide in longtermism.
In this shortform, I'm going to take a polarity approach. I'm going to bring each pole to its extreme, probably beyond positions that are actually held, because I think median longtermism, or the longtermism described in The Precipice, is a kind of average of the two.
Negative longtermism is saying "let's not let some bad stuff happen", namely extinction. It wants to preserve. If nothing gets better for the poor or the an... (read more)
Would there be a way of estimating how many people within the Amazon organization are fanatical about same-day delivery, as a ratio against how many are "just working a job"? Does anyone have a guess? My guess is that an organization of that size with a lot of cash only needs about 50 true fanatics; the rest can be "mere employees". What do y'all think?
I can't really think of any research bearing on this, and unclear how you'd
measure it anyway.
One way to go might be to note that there is a wide (and weird) variance between
the efficiency of companies: market pressures are slack enough that two
companies doing as far as can be told the exact same thing in the same
geographic markets with the same inputs might be almost 100% different (I think that was the range in the example of concrete manufacturing in one paper I read); a
lot of that difference appears to be explainable by the quality of the
management, and you can do randomized experiments in management coaching or
intensity of management and see substantial changes in the efficiency of a
company [https://www.gwern.net/notes/Competence#bloom-et-al-2012] (Bloom - the
other one - has a bunch of studies like this). Presumably you could try to
extrapolate from the effects of individuals to company-wide effects, and define
the goal of the 'fanatical' as something like 'maintaining top-10% industry-wide
performance': if educating the CEO is worth X percentiles and hiring a good
manager is worth 0.0Y percentiles and you have such and such a number of each,
then multiply out to figure out what will bump you 40 percentiles from an
imagined baseline of 50% to the 90% goal.
Another argument might be a more Fermi estimate style argument from startups. A
good startup CEO should be a fanatic about something, otherwise they probably
aren't going to survive the job. So we can assume one fanatic at least. People
generally talk about startups beginning to lose the special startup magic of
agility, focus, and fanaticism at around Dunbar's number level of employees like
300, or even less (eg Amazon's two-pizza rule which is I guess 6 people?). In
the 'worst' case that the founder has hired 0 fanatics, that implies 1 fanatic
can ride herd over no more than ~300 people; in the 'best' case that he's hired
dozens, then each fanatic can only cover for more like 2 or 3 non-fanatics. I'm
Dagon (1y):
I'm not sure "fanatical" is well-defined enough to mean anything here. I doubt
there are any who'd commit terrorist acts to further same-day delivery. There
are probably quite a few who believe it's important to the business, and a big
benefit for many customers.
You're absolutely right that a lot of employees and contractors can be "mere
employees", not particularly caring about long-term strategy, customer
perception, or the like. That's kind of the nature of ALL organizations and
group behaviors, including corporate, government, and social groupings. There's
generally some amount of influencers/selectors/visionaries, some amount of
strategists and implementers, and a large number of followers. Most
organizations are multidimensional enough that the same people can play
different roles on different topics as well.
JBlack (1y):
I don't think it needs any true fanatics. It just needs incentives.
This isn't to say there won't be fanatics anyway. There probably aren't many
things that nobody can get fanatical about. This is even more true if they're
given incentives to act fanatical about it.
rhollerith_dot_com (1y):
Sure, but the incentive structure needs continual maintenance to keep it aligned
with or pointing at the goal, which naturally leads to the questions of how many
people are needed to keep the structure pointing at the goal, and what the
motivation of those people will be.
We need a name for the following heuristic, I think. I think of it as one of those "tribal knowledge" things that gets passed on like an oral tradition without being citeable in the sense of being part of a literature. If you come up with a name I'll certainly credit you in a top level post!
I heard it from Abram Demski at AISU'21.
Suppose you're either going to end up in world A or world B, and you're uncertain about which one it's going to be. Suppose you can pull lever LA which will be 100 valuable if you end up in world A, or you can pull lever LB whi... (read more)
Why are you specifying 100 or 0 value, and using fuzzy language like "acceptably
small" for disvalue?
Is this based on "value" and "disvalue" being different dimensions, and thus
incomparable? Wouldn't you just include both in your prediction, and run it
through your (best guess of) utility function and pick highest expectation,
weighted by your probability estimate of which universe you'll find yourself in?
TLW (1y):
100 and 0 in this context make sense. Or at least in my initial reading:
arbitrarily-chosen values that are in a decent range to work quickly with (akin
to why people often work in percentages instead of 0..1)
It is - I'm going to say "often", although I am aware this is suboptimal
phrasing - often the case that you are confident in the sign of an outcome but
not the magnitude of the outcome.
As such, you can often end up with discontinuities at zero.
Dropping the entire probability distribution of outcomes through your utility
function doesn't even necessarily have a closed-form result. In a universe where
computation itself is a cost, finding a cheaper heuristic (and working through
if said heuristic has any particular basis or problems) can be valuable.
The heuristic in the grandparent comment is just what happens if you are
simultaneously very confident in the sign of positive results, and have very
little confidence in the magnitude of negative results.
TLW (1y):
It is often the case that you are confident in the sign of an outcome but not
the magnitude of the outcome.
This heuristic is what happens if you are simultaneously very confident in the
sign of positive results, and have very little confidence in the magnitude of
negative results.
Measure (1y):
I'm not sure I understand. If the lever is +100 in world A and -90 in world B,
it seems like a good bet if you don't know which world you're in. Or is that
what you mean by "acceptably small amount of disvalue"?
Quinn (1y):
Obviously there are considerations downstream of articulating this. One is that when P(A) > P(B) but V(LA|A) < V(LB|B), it can be reasonable to hedge on ending up in world B even though it's not strictly more probable than ending up in world A.
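A minimal numerical sketch of that comparison (Python; the probabilities and payoffs are invented, and I assume each lever pays nothing in the world it wasn't built for): even with P(A) > P(B), the lever aimed at world B can have the higher expected value if its payoff there is large enough.

```python
# Invented numbers for illustration.
p_A, p_B = 0.6, 0.4         # P(A) > P(B)
v_LA_given_A = 100          # value of lever L_A if world A obtains
v_LB_given_B = 200          # value of lever L_B if world B obtains

ev_LA = p_A * v_LA_given_A  # 60.0
ev_LB = p_B * v_LB_given_B  # 80.0
print(ev_LA, ev_LB)         # hedging on the less probable world B wins here
```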
critiques and complaints
I think one of the most crucial meta skills I've developed is honing my sense of who's criticizing me vs. who's complaining.
A criticism is actionable; implicitly, it's often from someone who wants you to win. A complaint is when you can't figure out how you'd actionably fix something or improve based on what you're being told.
This simple binary story is problematic. It can empower you to ignore criticism you don't like by providing a set of excuses, if you're not careful. Sometimes it's operationally impossible to parse out a critic... (read more)
The anchoring effect is enough for a Schelling point; it doesn't have to be a simple solution.
For instance a new nation that wants to move away from dictatorship is
automatically going to build a democracy with multiple independent arms
(legislature, judiciary, executive), a constitution, periodic elections of
representatives, etc.
They could choose to try a direct democracy or change the term from 5 years to 1
year, they could choose to have public elections for the judiciary too, or any
other deviation from how democracies usually run, but they won't. Fear of the
unknown + no creativity or motivation will be sufficient for them to copy
existing countries' democratic structure.
Disvalue via interpersonal expected value and probability
My deontologist friend just told me that treating people like investments is no way to live. The benefits of living by that take are that your commitments are more binding, and you actually do factor out uncertainty, because when you treat people like investments you always think "well, someday I'll no longer be creating value for this person and they'll drop me from their life". It's hard to make long-term plans, living like that.
I've kept friends around out of loyalty to what we shared 5-10 years ago w... (read more)
One thing to be careful about in such decisions - you don't know your own
utility function very precisely, and your modeling of both future interactions
and your value from such are EXTREMELY lossy.
The best argument for deontological approaches is that you're running on very
corrupt hardware, and rules that have evolved and been tested over a long period
of time are far more trustworthy than your ad-hoc analysis which privileges
obvious visible artifacts over more subtle (but often more important)
considerations.
[anonymous] (1y):
Imo choosing to disconnect from people who are no longer providing any value to
you is just a healthy thing to do, even a deontologist should agree with that.
I may refine this into a formal bounty at some point.
I'm curious if censorship would actually work in the context of blocking deployment of superpowerful AI systems. Sometimes people will mention "matrix multiplication" as a sort of goofy edge case, which isn't very plausible, but that doesn't mean there couldn't be actual political pressure to censor it. A more plausible example would be attention. Say the government threatens soft power against arxiv if they don't pull "Attention Is All You Need", or threatens soft power against Harvard if their linguistic... (read more)
You'll need a govt body full of people who are aligned in their thinking, no one
should defect.
Also Yudkowsky's response to this would prolly be that it isn't enough to censor
the first time the idea is created, someone else will just discover another (or
the same) path to AGI independently. See pivotal act
[https://arbital.com/p/pivotal/].
any literature on estimates of social impact of businesses divided by their valuations?
the idea that dollars are a proxy for social impact is neat, but it leaves a lot of room for Goodhart, and I think it's plausible that they diverge entirely in some cases. It would be useful to know, if possible to know, what's going on here.
there's paid tools that estimate this, probably poorly
Quinn (1y):
thinking about this comment
[https://www.lesswrong.com/posts/kuDKtwwbsksAW4BG2/zvi-s-thoughts-on-the-survival-and-flourishing-fund-sff?commentId=aCu7tC6LAqRiyACgv]
Why have I heard about Tyson investing into lab grown, but I haven't heard about big oil investing in renewable?
Tyson's basic insight here is not to identify as "an animal agriculture company". Instead, they identify as "a feeding people company". (Which happens to align with doing the right thing, conveniently!)
It seems like big oil is making a tremendous mistake here. Do you think oil execs go around saying "we're an oil company"? When they could instead be going around saying "we're a powering stuff" company. Being a powering stuff company means you hav... (read more)
Yes, this is more about you not hearing about it.
Shell Has A Bigger Clean Energy Plan Than You Think — CleanTechnica Interview
[https://cleantechnica.com/2020/05/01/shell-has-a-bigger-clean-energy-plan-than-you-think-cleantechnica-interview/]
BP Bets Future On Green Energy, But Investors Remain Wary
[https://www.wsj.com/articles/bp-bets-future-on-green-energy-but-investors-remain-wary-11601402304]
It seems that Tyson invested 150 million
[https://futurism.com/lab-grown-meat-tyson-is-making-a-massive-investment-in-a-meatless-future]
into a fund for new food solutions. In contrast, Exxon invested 600 million
[https://www.scientificamerican.com/article/biofuels-algae-exxon-venter/]
in algae biofuels back in 2009, and more afterward.
Yoav Ravid (1y):
I do vaguely remember hearing of big oil doing that, though perhaps not as much as meat producers do with lab-grown meat; try looking into it.
Pattern (1y):
1. Might be a little bit harder in that industry.
2. Are they in charge (of that)? Who chose them?
Quinn (1y):
you're most likely right about it being harder in the industry!
I don't think they need permission or an external mandate to do the right thing!
JBlack (1y):
The main problem is that prior investment into the oil method of powering stuff
doesn't translate into having a comparative advantage in a renewable way of
powering stuff. They want a return on their existing massive investments.
While this looks superficially like a sunk cost fallacy, it isn't. If a
comparatively small investment (mere billions) can ensure continued returns on
their trillions of sunk capital for another decade, it's worth it to them.
Investment into renewable powering stuff would require substantially different
skill sets in employees, in very different locations, and highly non-overlapping
investment. At best, such an endeavour would constitute a wholly owned
subsidiary that grows while the rest of the company withers. At worst, a
parasite that hastens the demise of the parent while eventually failing in the
face of competition anyway.
I've had a background assumption in my interpretation of and beliefs about reward functions for as long as I can remember (i.e. since first reading the sequences), which I suddenly realized I don't believe is written down. Over the last two years I've gained experience writing Coq sufficient to inspire a convenient way of framing it.
Computational vs axiomatic reward functions
Computational vs axiomatic in proof engineering
A proof engineer calls a proposition computational if its proof can be broken down into parts.
For example, a + (b + c) = (a + b) + c i... (read more)
I should be more careful not to imply I think that we have solid specimens of
computational reward functions; more that I think it's a theoretically important
region of the space of possible minds, and might factor in idealizations of
agency
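To gesture at the proof-engineering distinction with a toy example (Lean 4 here rather than the Coq the post mentions, and this is my reading of the truncated example above): a proposition like associativity can be backed by a proof term assembled from smaller pieces, or it can simply be postulated as an axiom, which leaves nothing to break down.

```lean
-- "Computational": the proof term is assembled from smaller pieces
-- (symmetry applied to a library lemma that is itself proved by induction).
theorem add_assoc' (a b c : Nat) : a + (b + c) = (a + b) + c :=
  (Nat.add_assoc a b c).symm

-- "Axiomatic": the same statement asserted with no internal structure to break down.
axiom add_assoc_ax (a b c : Nat) : a + (b + c) = (a + b) + c
```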
capabilities-prone research
I come to you with a dollar I want to spend on AI. You can allocate p pennies to go to capabilities and 100-p pennies to go to alignment, but only if you know of a project that realizes that allocation. For example, we might think that GAN research sets p = 98 (providing 2 cents to alignment) while interpretability research sets p = 10 (providing 90 cents to alignment).
Is this remotely useful? This is a really rough model (you might think it's more of a venn diagram and that this model doesn't provide a way of reasoning about t... (read more)
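A trivial rendering of that toy model (Python; the project names and p values are the hypothetical ones from the post):

```python
# Hypothetical projects and the capabilities share p each one realizes, per the post.
projects = {"GAN research": 98, "interpretability research": 10}

def allocate(cents: int, p: int) -> tuple[int, int]:
    """Split a dollar: p cents of it to capabilities, the remainder to alignment."""
    capabilities = cents * p // 100
    return capabilities, cents - capabilities

for name, p in projects.items():
    print(name, allocate(100, p))  # GAN research (98, 2); interpretability research (10, 90)
```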
there's a gap in my inside view of the problem: part of me thinks that capabilities progress such as out-of-distribution robustness, or the 4 tenets described in Open Problems in Cooperative AI, is necessary for AI to be transformative, i.e. a prereq of TAI; another part of me thinks AI will be xrisky and unstable if it progresses along other aspects but not along the axis of those capabilities.
There's a geometry here: transformative / not transformative, crossed with dangerous / not dangerous.
To have an inside view I must be able to adequately navigate between the quadrants with respect to outcomes, interventions, etc.
If something can learn fast enough, then its out-of-distribution performance
won't matter as much. (OOD performance will still matter -but it'll have less to
learn where it's good, and more to learn where it's not.*)
*Although generalization ability seems like the reason learning matters. So I
see why it seems necessary for 'transformation'.
missed opportunities to build a predictive track record and trump
I was reminiscing about my prediction market failures; the clearest "almost won a lot of mana dollars" (if Manifold Markets had existed back then) was this executive order. The campaign speeches made it fairly obvious, and I'm still salty about a few idiots telling me "stop being hysterical" when, pre-inauguration, I accused him of being exactly what he was writing on the tin, even though I overall reminisce that being a time when my epistemics were way worse than they are now. However, there d... (read more)
Claims - thoughts on chapter eight of Craft of Research
Broadly, the two kinds of claims are conceptual and practical.
Conceptual claims ask readers not to act, but to understand. The flavors of conceptual claim are as follows:
There's essentially one flavor of practical claim
If you read between the lines, you might notice that a kind of claim of fact or cause/consequence is that a policy work... (read more)
Jotted down some notes about the law of mad science on the EA Forum. Looks like some pretty interesting open problems in the global priorities, xrisk strategy space. https://forum.effectivealtruism.org/posts/r5GbSZ7dcb6nbuWch/quinn-s-shortform?commentId=DqSh6ifdXpwHgXnCG
The audience models of research - thoughts on Craft of Doing Research chapter 2
Before considering the role you're creating for your reader, consider the role you're creating for yourself. Your broad options are the following
Is there an EV monad? I'm inclined to think there is not, because EV(EV(X)) is a way simpler structure than a "flatmap" analogue.
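A minimal sketch of the contrast as I read it (Python; the finite-distribution encoding is my own illustration): a distribution monad's flatmap flattens a distribution over distributions into another distribution, while expectation collapses everything to a single number, so iterating EV adds no structure.

```python
# Finite distributions as {outcome: probability} dicts (illustrative encoding).
def ev(dist):
    """Expectation: collapse a distribution over numbers to one number."""
    return sum(x * p for x, p in dist.items())

def flatmap(dist, f):
    """Monadic bind: f maps an outcome to a distribution; flatten the result."""
    out = {}
    for x, p in dist.items():
        for y, q in f(x).items():
            out[y] = out.get(y, 0) + p * q
    return out

die = {x: 1/6 for x in range(1, 7)}
reroll_on_six = flatmap(die, lambda x: die if x == 6 else {x: 1.0})
# flatmap returns another distribution; ev already returns a plain number,
# so ev(ev(...)) adds nothing beyond ev(...).
print(ev(die), ev(reroll_on_six))
```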
hmu for a Haskell job in decentralized finance. Super fun zero-knowledge proof stuff, great earning-to-give opportunity.
Are Schelling points the Occam's razor of mechanism design?
Intuitively I think simplicity is a good explanation for a solution being converged upon.
Does anyone have any crisp examples that violate the correspondence between Schelling points and Occam's razor?
Question your argument as your readers will - thoughts on chapter 10 of Craft of Research
Three predictable disagreements are
There are roughly two kinds of queries readers will have about your argument
testing latex in spoiler tag
Testing code block in spoiler tag