All of Peter Wildeford's Comments + Replies

Does the chance that evolution got really lucky cancel out with the chance that evolution got really unlucky? So maybe this doesn't change the mean but does increase the variance? As for how much to increase the variance, maybe something like an additional +/-1 OOM tacked on to the existing evolution anchor?

I'm kinda thinking there's like a 10% chance you'd have to increase it by 10x and a 10% chance you'd have to decrease it by 10x. But maybe I'm not thinking about this right?
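
A quick toy calculation (an illustration added here, not part of the original thread) shows why a symmetric chance of luck in either direction leaves the central estimate alone but widens the spread. Working in log10-FLOP space, model the correction to the evolution anchor as 0 with probability 0.8, +1 OOM with probability 0.1, and -1 OOM with probability 0.1:

```python
# Mixture over corrections to the evolution anchor, in orders of magnitude
# (log10 FLOP). The 80/10/10 split mirrors the "10% chance either way" guess.
probs = [0.8, 0.1, 0.1]
shifts = [0.0, +1.0, -1.0]

mean = sum(p * s for p, s in zip(probs, shifts))
variance = sum(p * (s - mean) ** 2 for p, s in zip(probs, shifts))

print(mean)      # 0.0 -> the anchor's central estimate is unchanged
print(variance)  # 0.2 -> the spread widens (std dev ~0.45 OOM)
```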

2Ege Erdil2mo
The problem with the "evolution got really unlucky" assumption is the Fermi paradox. It seems like to resolve the Fermi paradox we basically have to assume that evolution got really lucky at least at some point if we assume the entire Great Filter is already behind us. Of course in principle it's possible all of this luck was concentrated in an early step like abiogenesis which AI capabilities research has already achieved the equivalent of, and there was no special luck that was needed after that.

The important question seems to be whether we're already past "the Great Filter" in what makes intelligence difficult to evolve naturally or not. If the difficulty is concentrated in earlier steps then we're likely already past it and it won't pose a problem, but e.g. if the apes -> humans transition was particularly difficult then it means building AGI might take far more compute than we'll have at our disposal, or at least that evolutionary arguments cannot put a good bound on how much compute it would take.

The counterargument I give is that Hanson's model implies that if the apes -> humans transition was particularly hard then the number of hard steps in evolution has to be on the order of 100, and that seems inconsistent with both details of evolutionary history (such as how long it took to get multicellular life from unicellular life, for example) and what we think we know about Earth's remaining habitability lifespan. So the number of hard steps was probably small and that is inconsistent with the apes -> humans transition being a hard step.
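
To make the order-of-magnitude claim concrete: in Hanson's hard-steps model, conditional on n hard steps all succeeding within a window of length T, each step's expected duration is roughly T/(n+1), so a hard step observed to take time d suggests n ≈ T/d - 1. A back-of-the-envelope sketch (the window and transition duration below are illustrative assumptions, not figures from the comment):

```python
# Hard-steps arithmetic: conditional on n hard steps all finishing inside a
# window of length T, each step's expected duration is about T / (n + 1).
T = 4.5e9  # assumed window: rough age of life-bearing Earth, in years
d = 1.0e7  # assumed duration of the apes -> humans transition, in years

n = T / d - 1
print(round(n))  # ~449 -> hundreds of hard steps, if that transition was one
```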

There are a lot of different ways you can talk about "efficiency" here. The main thing I am thinking about with regard to the key question "how much FLOP would we expect transformative AI to require?" is whether, when using a neural net anchor (not evolution), we should add a 1-3 OOM penalty to FLOP needs due to 2022-era AI systems being less sample efficient than humans (requiring more data to produce the same capabilities), with this penalty decreasing over time given expected algorithmic progress. The next question would be how much more efficient potential AI (... (read more)

1jhoogland3mo
To me this isn't clear. Yes, we're better one-shot learners, but I'd say the most likely explanation is that the human training set is larger and that much of that training set is hidden away in our evolutionary past. It's one thing to estimate evolution FLOP (and as Nuño points out, even that is questionable [https://nunosempere.com/blog/2022/08/10/evolutionary-anchor/]). It strikes me as much more difficult (and even more dubious) to estimate the "number of samples" or "total training signal (bytes)" over one's lifetime / evolution.

Yeah ok 80%. I also do concede this is a very trivial thing, not like some "gotcha look at what stupid LMs can't do no AGI until 2400".

This is admittedly pretty trivial but I am 90% sure that if you prompt GPT4 with "Q: What is today's date?" it will not answer correctly. I think something like this would literally be the least impressive thing that GPT4 won't be able to do.

8gwern4mo
Are you really 90% sure on that? For example, LaMDA apparently has live web query access (a direction OA was also exploring with WebGPT), and could easily recognize that as a factual query worth a web query, and if you search Google for "what is today's date?" it will of course spit back "Monday, August 22, 2022", which even the stupidest LMs could make good use of. So your prediction would appear to boil down to "OA won't do an obviously useful thing they already half-did and a competitor did do a year ago".

Is it ironic that the link to "All the posts I will never write" goes to a 404 page?

3adamShimi4mo
It's a charitable (and hilarious) interpretation. What actually happened is that he drafted it by mistake instead of just editing it to add stuff. It should be fine now.
5Owain_Evans6mo
We didn't try but I would guess that finetuning on simple math questions wouldn't help with Metaculus forecasting. The focus of our paper is more "express your own uncertainty using natural language" and less "get better at judgmental forecasting". (Though some of the ideas in the paper might be useful in the forecasting domain.)

This sounds like something that could be done by an organization creating a job for it, which could help with mentorship, connections, motivation, and job security relative to expecting people to apply to EAIF/LTFF.

My organization (Rethink Priorities) is currently hiring for research assistants and research fellows (among other roles) and some of their responsibilities will include distillation.

These conversations are great and I really admire the transparency. It's really nice to see discussions that normally happen in private happen instead in public, where everyone can reflect, give feedback, and improve their own thoughts. On the other hand, the conversations combined add up to a decent-sized novel - LW says 198,846 words! Is anyone considering investing heavily in summarizing the content so people can get involved without having to read all of it?

6Daniel Kokotajlo9mo
Here is a heavily condensed summary of the takeoff speeds thread of the conversation, incorporating earlier points made by Hanson, Grace, etc. https://objection.lol/objection/3262835 [https://objection.lol/objection/3262835] :) (kudos to Ben Goldhaber for pointing me to it)

Echoing that I loved these conversations and I'm super grateful to everyone who participated — especially Richard, Paul, Eliezer, Nate, Ajeya, Carl, Rohin, and Jaan, who contributed a lot.

I don't plan to try to summarize the discussions or distill key take-aways myself (other than the extremely cursory job I did on https://intelligence.org/late-2021-miri-conversations/), but I'm very keen on seeing others attempt that, especially as part of a process to figure out their own models and do some evaluative work.

I think I'd rather see partial summaries/respons... (read more)

7Ben Pace9mo
I chatted briefly the other day with Rob Bensinger about me turning them into a little book. My guess is I'd want to do something to compress especially the long Paul/Eliezer bet hashing out, that felt super long to me and not all worth the reading. Interested in other suggestions for compression. (This is not a commitment to do this, I probably won't.)

I don't recall the specific claim, just that EY's probability mass for the claim was in the 95-99% range. The person argued that because EY disagrees with some other thoughtful people on that question, he shouldn't have such confidence.

I think people conflate the very reasonable "I am not going to adopt your 95-99% range because other thoughtful people disagree and I have no particular reason to trust you massively more than I trust other people" with the different claim "the fact that other thoughtful people disagree means there's no way you could arrive at 95-99% confidence", which is false. I think thoughtful people disagreeing with you is decent evidence that you are wrong, but it can still be outweighed.

I can see whether the site is down or not. Seems pretty clear.

1Forged Invariant1y
Just be aware that other users have already noticed messages which could be deliberate false alarms: https://www.lesswrong.com/posts/EW8yZYcu3Kff2qShS/petrov-day-2021-mutually-assured-destruction?commentId=JbsutYRotfPDLNskK [https://www.lesswrong.com/posts/EW8yZYcu3Kff2qShS/petrov-day-2021-mutually-assured-destruction?commentId=JbsutYRotfPDLNskK]
1MichaelStJules1y
I don't think you'll be able to retaliate if the site is down.
1[comment deleted]1y

Attention LessWrong - I am a chosen user of EA Forum and I have the codes needed to destroy LessWrong. I hereby make a no first use pledge and I will not enter my codes for any reason, even if asked to do so. I also hereby pledge to second strike - if the EA Forum is taken down, I will retaliate.

Regarding your second strike pledge: it would of course be wildly disingenuous to remember Petrov's action, which was not jumping to retaliation, by doing the opposite and jumping to retaliation.

I believe you know this, and would guess that if in fact one of the sites went down, you'd do nothing, and instead later post about your moral choice of not retaliating.

(I'd also guess, if you choose to respond to this comment, it'd be to reiterate the pledge to retaliate, as you've done elsewhere. This does make sense--threats must be unequivocal to be believed, e... (read more)

4Neel Nanda1y
Mutual Assured Destruction just isn't the same when you can see for sure whether you were nuked

Seems like "the right prompt" is doing a lot of work here. How do we know if we have given it "the right prompt"?

Do you think GPT-4 could do my taxes?

1Michaël Trazzi2y
re right prompt: GPT-3 has a context window of 2048 tokens, so this limits quite a lot what it could do. Also, it's not accurate at two-digit multiplication (which you would at least need to multiply your $ by %), even worse at 5-digit. So in this case, we're sure it can't do your taxes. And in the more general case, gwern wrote some debugging steps [https://www.gwern.net/GPT-3#effective-prompt-programming] to check if the problem is GPT-3 or your prompt. Now, for GPT-4, given they keep scaling the same way, it won't be possible to have accurate enough digit multiplication (like 4-5 digits, cf. this [https://www.lesswrong.com/posts/tt7WtqiEyEiLmAecZ/what-will-gpt-4-be-incapable-of?commentId=sBDde2ZvMqidazDrG] thread), but with three more scalings it should do it. The prompt would be "here are a few examples of how to do tax multiplication and addition given my format, so please output the result in that format", and concatenate those two. I'm happy to bet $1 1:1 on GPT-7 doing tax multiplication to 90% accuracy (given only integer precision).
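
For anyone who wants to test claims like these directly, here is a minimal sketch (added for illustration; `query_model` is a hypothetical stand-in for whatever LM API you use) of an exact-match accuracy check on n-digit multiplication:

```python
import random

def multiplication_accuracy(query_model, n_digits: int, trials: int = 100) -> float:
    """Exact-match accuracy of `query_model` on random n-digit multiplication.

    `query_model` is any callable taking a prompt string and returning an
    answer string; a hypothetical stand-in for a real LM API call.
    """
    correct = 0
    for _ in range(trials):
        a = random.randint(10 ** (n_digits - 1), 10 ** n_digits - 1)
        b = random.randint(10 ** (n_digits - 1), 10 ** n_digits - 1)
        answer = query_model(f"Q: What is {a} * {b}?\nA:")
        correct += answer.strip() == str(a * b)
    return correct / trials

# Sanity check with a perfect "model" (a calculator), which scores 1.0:
calculator = lambda p: str(eval(p.split("What is ")[1].split("?")[0]))
print(multiplication_accuracy(calculator, n_digits=2))
```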

1.) I think the core problem is that honestly no one (except 80K) has actually invested significant effort in growing the EA community since 2015 (especially compared to the pre-2015 effort, and especially as a percentage of total EA resources)

2.) Some of these examples are suspect. The GiveWell numbers definitely look to be increasing beyond 2015, especially when OpenPhil's understandably constant fundraising is removed - and this increase in GiveWell seems to line up with GiveWell's increased investment in their outreach. The OpenPhil numbers also look ... (read more)

1AppliedDivinityStudies2y
This is great, thanks! Wish I had seen this earlier.

Mr. Money Mustache has a lot of really good advice that I find a lot of value from. However, I think Mr. Money Mustache underestimates the ease and impact of opportunities to grow income relative to cutting spending - especially if you're in (or can be in) a high-earning field like tech. Doubling your income will put you on a much faster path than cutting your spending a further 5%.

2Adam Zerner2y
Yeah, that makes a lot of sense.

PredictionBook is really great for lightweight, private predictions and does everything you're looking for. Metaculus is great for more fully-featured predicting and I believe also supports private questions, but may be a bit of overkill for your use case. A spreadsheet also seems more than sufficient, as others have mentioned.

Thanks. I'll definitely aim to produce them more quickly... this one got away from me.

My understanding is that we have also spent, and might in the future spend, a decent amount of time in a "level 2.5", where some but not all non-essential businesses are open (e.g., no groups larger than ten, restaurants are closed to dine-in, hair salons are open).

A binary search strategy still could be more efficient, depending on the ratio of positives to negatives.

2Steven Byrnes3y
Don't forget there could be many positives per pool...
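
A quick simulation (an added sketch; the pool size, prevalence values, and the naive halving rule are illustrative assumptions) shows how the comparison flips with the positive rate, including the point that multiple positives per pool erode the binary-search advantage:

```python
import random

def dorfman_tests_per_person(p: float, k: int) -> float:
    # Simple two-stage pooling: one pool test, plus k individual
    # retests whenever the pool comes back positive.
    return 1 / k + (1 - (1 - p) ** k)

def halving_tests(pool) -> int:
    # Naive recursive halving: test the (sub)pool; if positive and larger
    # than one sample, split in half and recurse on both halves.
    if not pool:
        return 0
    if not any(pool) or len(pool) == 1:
        return 1
    mid = len(pool) // 2
    return 1 + halving_tests(pool[:mid]) + halving_tests(pool[mid:])

def halving_tests_per_person(p: float, k: int, trials: int = 10_000) -> float:
    total = sum(halving_tests([random.random() < p for _ in range(k)])
                for _ in range(trials))
    return total / (trials * k)

for p in (0.01, 0.05, 0.2):  # low, medium, high positive rates
    print(p, round(dorfman_tests_per_person(p, 16), 3),
          round(halving_tests_per_person(p, 16), 3))
```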

Not really an answer, but a statement and a question - I imagine this is literally the least neglected issue in the world right now. How much does that affect the calculus? How much should we defer to people with more domain expertise?

3Kenny3y
We should defer to people with more domain expertise exactly as much as we would normally do (all else being equal). Almost all of what's posted to and discussed on this site is 'non-original work' (or, at best, original derivative work). That's our comparative advantage! Interpreting and synthesizing others' work is what we do best, and this single issue affects both every regular user and any potential visitor immensely. There's no reason why we can't continue to focus long-term on our current priorities – but the pandemic affects all of our abilities to do so and I don't think any of us can completely ignore this crisis.

It could also be on the list of pros, depending on how one uses LW.

I feel obligated to note that it will in fact only destroy the frontpage of LW, not the rest of the site.

Are you offering to take donations in exchange for pressing the button or not pressing the button?

1Ramiro P.3y
I thought he was being ambiguous on purpose, so as to maximize donations.
1William_S3y
I think the better version of this strategy would involve getting competing donations from both sides, using some weighting of total donations for/against to set a probability of pressing the button, and tweaking that weighting so that the expected probability of pressing is low (because pressing the button threatens to lower the probability of future games of this kind; this is an iterated game rather than a one-shot).
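
One way to implement that weighting (a sketch of the idea, not the exact proposal above; the 3x over-weighting of "against" donations is an arbitrary illustrative choice):

```python
def press_probability(donations_for: float, donations_against: float,
                      weight_against: float = 3.0) -> float:
    # Over-weighting "against" donations keeps the expected probability of
    # pressing low, preserving the iterated game, while still letting both
    # sides move the odds with money.
    weighted_for = donations_for
    weighted_against = weight_against * donations_against
    if weighted_for + weighted_against == 0:
        return 0.0
    return weighted_for / (weighted_for + weighted_against)

print(press_probability(100, 100))  # equal donations -> 0.25 chance of pressing
```
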
7jefftk3y
I would give someone my launch codes in exchange for a sufficiently large counterfactual donation. I haven't thought seriously about how large it would need to be, because I don't expect someone to take me up on this, but if you're interested we can talk.

What happens if you don't check off everything for the day?

3VipulNaik4y
That's a normal part of life :). Any things that I decide to do in a future day, I'll copy/paste to over there, but I usually won't delete the items from the checklist for the day where I didn't complete them (thereby creating a record of things I expected or hoped to do, but didn't). For instance, at https://github.com/vipulnaik/daily-updates/issues/54 [https://github.com/vipulnaik/daily-updates/issues/54] I have two undone items.

This sounds fairly similar to being on a board of a non-profit.

Nice post. I'd be curious to hear what all the monthly themes were.

I don't think so. The second equation is negative infinity for karma = 0, which doesn't seem right.

4philh5y
It sounds like you're agreeing with Unnamed but think that you're disagreeing? (Another minor piece of evidence that the second equation is wrong is that it could be more simply written floor(log_5(karma))+2.)
3Raemon5y
I believe the equation was constructed specifically to avoid that scenario (but also I don't know the math to check it myself)
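
A two-line check (added here for illustration) of philh's floor(log_5(karma))+2 form makes the karma = 0 problem concrete: the logarithm is undefined at zero, so the formula must be special-cased there.

```python
import math

def vote_power(karma: int) -> int:
    # The disputed form: floor(log_5(karma)) + 2
    return math.floor(math.log(karma, 5)) + 2

for k in (0, 1, 10, 100, 1000):
    try:
        print(k, vote_power(k))
    except ValueError:
        print(k, "undefined (log of zero)")  # the karma = 0 problem
```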

That makes sense. On a mostly unrelated note, is there any way to get notified when someone replies to my comment?

3Raemon5y
Quick check: if you click on the Bell icon at the top-right, do you see any notifications about replies? (Last I checked, notifications appeared, we just hadn't created a thing to turn the bell red or anything when they happen)

This appeared on https://www.lesserwrong.com/daily

My karma from the old LessWrong did not port over. Is this normal?

8Raemon5y
It's a temporary issue (nobody's karma has ported over as-of-now). It turns out migrating millions of upvotes from one system to another is hard.
3Said Achmiz5y
Ditto.

Have you thought about Vimium instead of Karabiner?

0paulfchristiano5y
I also use vimium, but there are lots of things it doesn't cover.

Thanks for the feedback.

I added a paragraph to above saying: "We're also using this as a way to build up the online EA community, such as featuring people on a global map of EAs and with a list of EA Profiles. This way more people can learn about the EA community. We will ask you in the survey if you would like to join us, but you do not have to opt-in and you will be opted-out by default."

4fubarobfusco6y
Thank you.

Why do you think this? The outside view suggests this won't happen -- disclosing success and failure is uncommon in the non-profit space.

0casebash6y
A major proportion of the clients will be EAs.

It seems to have had consequences for at least one poster (namely, the OP).

0gjm6y
Sure. That seems like a slender thread of evidence on which to hang any sort of general claim, though.

I think we should change this, because a lack of fixed rules makes LW pretty hard to use and helps keep it dead.

2gjm6y
It's not clear to me that a lack of fixed rules has that consequence. Why do you think that?

This is pretty cool -- I like the write-up. I don't mean to pry into your life, but I would find it interesting to see an example of how you answer these questions. It would help me internalize the process more.

0Elo6y
I had about a thousand words of examples that I had generated but they were really rubbish and irrelevant. ( http://bearlamp.com.au/the-ladder-of-abstraction-and-giving-examples/ [http://bearlamp.com.au/the-ladder-of-abstraction-and-giving-examples/]) Exploration for me at the moment is spending lots of time having rationality conversations with people - usually in person. Then when I generate great ideas and insights and solutions to problems that people have I come home and exploit that knowledge by writing it down and sharing it with people and honing my set of published writing.

What category does writing posts go under? I'm impressed you can do a day job, write posts, and still have a lot of messing around time! :)

0Elo6y
I am so glad you said so because it means I can be delighted to call your attention to this post that I have previously written http://bearlamp.com.au/the-ladder-of-abstraction-and-giving-examples/ [http://bearlamp.com.au/the-ladder-of-abstraction-and-giving-examples/] and to apologise because that's not actually my list of time use, it was an example because my time is not a conventional schedule and hard to relate to. I can however give you a real example: in winter this year (Australia so 6 months ago), I was hitting a functional wall at 10pm, and only getting into the groove at 9am. Right now in summer it's more like 6am-2am (yes 4 hours of sleep several days this week means there are rumours I am a robot among my close family and friends).

For winter:

- 7am wake up
- 7-9 play on the slack, get out of bed and go for a run
- 9-9:30 shower and sit at my desk
- 9:30-1 gap (3.5)
- 1-2 lunch with lachlan
- 2-5:30 gap (3)
- 5:30-9 biohack sydney meetup group
- 9-9:30 go home
- 9:30-10 consider work
- 10 sleep

In this day I had 6.5hrs which I would have filled with things like emails, lesswrong, writing, research and any other type of work I had on. That's remarkably little time to work on things, and I got a lot less done than I do now. This is also why I would advocate for an optimum diet and exercise routine to give you as much time and energy as you possibly can.

10:20-1 work meeting (1hr40mins)

Still nitpicking, 10:20-1 is 2hr40min.

0Elo6y
fixed.