Most rationalist fellowships are very good about this (thanks!), but it’s worth saying: The only timezone for deadlines should be anywhere-on-earth.
Err, what do you mean by this? I'd think the deadline should be tied to some day and time, with the time given along with the timezone it's in.
"Applications are due May 18, end of day, anywhere on Earth" means that the application closes whenever it is no longer May 18 anywhere on Earth. According to Wikipedia, this means the deadline is midnight on Howland Island. For anyone else, this guarantees that they'll meet the deadline as long as they submit on May 18 in their local time zone, with an extra grace period of some number of hours depending on their location.
Around July last year I decided I was going to go all in on technical AI safety research. To do that I’d need to get into an AI safety fellowship, quit my job, and sell everything that was in my flat in South Africa (hopefully in that order).
I applied to every fellowship that was open[1], and got rejected from several of them before being accepted into MATS on Team Shard around mid-November. I handed in my notice the following Monday, and told my landlady that I had to move out in 6 weeks because I was leaving the country.
MATS went great! My co-author and I got the first spotlight talk at the MATS Symposium, we’ve submitted to NeurIPS, and I’m more hopeful than ever that I’ll be able to help reduce x-risk.
Things are good, but they could be so much better
I’m hoping that any goodwill I earned from this post allows me to make suggestions without people assuming I have ill intent. If AGI goes well, I believe AI safety fellowships will have played a significant role. I hope these fellowships can become even more effective than they currently are at putting good people in good places. And I hope this post helps these fellowships be marginally more effective.
Don’t say “we were impressed by your profile” unless you mean it
I got a lot of rejections, and all of them required someone to spend time reviewing my application. Sometimes, quite a lot of time! These rejection letters generally sounded like:
This is nice to hear! But also it comes across as a bit hollow: I would love to see fellowships put some weight behind the words. One option is to actively provide references for rejected (but promising) candidates. I’m imagining something like:
This is a strong letter: it makes falsifiable claims, it makes clear the decision process of the fellowship, and it makes recommendations to other fellowships. It also takes time to write, significantly more time than sending a rejection email. If a candidate is not strong enough that you’d feel comfortable writing this letter, then don’t! But the current status quo around rejection emails is a mixture of appeasement of the candidate and obscurity of the rejection reason, which does not encourage rejected candidates to improve.
This letter does something else: it allows other fellowships to spend less time reviewing promising candidates, if that candidate had already been reviewed by a similar fellowship. And in a complementary fashion, you can spend less time reviewing promising candidates if you receive a letter of recommendation from a fellowship you trust.
Letters of recommendation would increase the capacity of the AI safety field, by reducing the duplicated work done across different fellowships. However, this requires solving a collective action problem and making letters of recommendation the norm, which I suspect is a process only the big players (MATS, Astra, Anthropic Fellows Program) can kick-start.
Make the whole application timeline clear upfront
During the application process, candidates want to know how many rounds of “You ask, I respond” there will be. They want to know the deadlines for all those responses, and how much work is expected for each of their responses.
Not being clear about timelines makes it impossible for candidates to ensure they’ve got free time to complete those applications properly. This is not a case of “good enough candidates will figure it out”: poor communication selects for candidates who have lots of time, rather than for the best candidates. This adds noise to your application process, increasing false positives and false negatives.
If you do not know those deadlines before you open your applications, please communicate your best estimates of these deadlines.
Word limits, character limits, and timed forms
Word limits are good, but please ensure your application form calculates the word count as the candidate is typing, and displays this clearly. Unfortunately, it’s not enough for candidates to run their own word counter: some applications count hyphenated words (“fine-tuning”) as one word, others count them as two. Please tell the candidate what the count is, and tell them before they click submit.
If you can, please quietly accept ~5% over the word limit without penalty. Having a word limit is good: it ensures the candidate can think concisely, and reduces the reviewer burden. I don’t think being <5% over the limit detracts from either of those goals.
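To make the hyphenation ambiguity and the ~5% allowance concrete, here is a minimal sketch of how a form might count words and apply a tolerance. The function names and the `split_hyphens` and `tolerance_percent` parameters are my own illustrative choices, not taken from any real application platform.

```python
def count_words(text: str, split_hyphens: bool = False) -> int:
    """Count whitespace-separated words.

    With split_hyphens=True, "fine-tuning" counts as two words;
    otherwise it counts as one. Both conventions exist in the wild,
    which is why the form should display its own count.
    """
    if split_hyphens:
        text = text.replace("-", " ")
    return len(text.split())


def within_limit(count: int, limit: int, tolerance_percent: int = 5) -> bool:
    """Accept responses up to tolerance_percent over the stated limit.

    Integer arithmetic avoids floating-point surprises at the boundary.
    """
    return count * 100 <= limit * (100 + tolerance_percent)


answer = "I spent a year fine-tuning a small model"
print(count_words(answer))                      # hyphenated word counted once
print(count_words(answer, split_hyphens=True))  # hyphenated word counted twice
print(within_limit(210, 200))                   # exactly 5% over a 200-word limit
```

Whichever convention the form uses, the point is that the candidate sees the same number the reviewer’s system will enforce.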
In light of the above, I’d recommend having alphanumeric-only character limits instead of word limits or plain character limits. Candidates can cheat word limits by hyphenating words, by playing tricks with the formatting, or otherwise making their response harder to read. That’s terrible! Don’t create selection pressure for hard-to-read responses! Similarly, imposing a character limit encourages responses that are harder to read by removing paragraphs, headings, formatting, _emphasis_, citations, etc.
If you only count alphanumeric characters in your limit, then you’re not penalising candidates for adding paragraphs, headings, _emphasis_, or other punctuation that makes responses easier to review, while still enforcing something similar to a semantic limit on complexity.
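An alphanumeric-only limit is simple to compute. The sketch below assumes the response arrives as a plain string and uses Python’s built-in `str.isalnum`; the function name `alnum_count` and the sample strings are hypothetical.

```python
def alnum_count(text: str) -> int:
    """Count only letters and digits, ignoring whitespace, punctuation and markup."""
    return sum(1 for ch in text if ch.isalnum())


plain = "Key result: accuracy improved."
formatted = "**Key result:**\n\n_accuracy_ improved."

# Formatting adds raw characters, but not alphanumeric ones.
print(len(plain), len(formatted))
print(alnum_count(plain), alnum_count(formatted))
```

Under this measure the formatted answer costs the candidate nothing extra, whereas a plain character limit would charge them for every asterisk, underscore and line break.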
Proctored tests are a terrible, terrible time
Mostly because proctoring software is really bad. I’m aware that proctoring solves a real problem, and I’m in favour of the platonically ideal version of proctoring. I imagine that really good software wouldn’t be more stressful than an in-person test, but I’ve never used good proctoring software. Bad proctoring software artificially adds noise to your estimate of a candidate’s quality, meaning you’ll accept people you should have rejected, and reject people you should have accepted. Some specific recommendations:
Make it clear who you’ve accepted in the past
This point is hard to make with certainty, but if you are happy with the fellows you’ve accepted in the past, you should make some information about them public in your applications. For example: alma mater, previous fellowships, previous AI safety experience, education level, gender, age, technical background. The goal here is to encourage promising candidates to apply. Many fellowships have built up an aura of being impossible to get into, and this is only valuable insofar as it is true. If a candidate believes the odds are 10,000 to 1 and does not bother applying, this is a loss to the fellowship if the odds were actually 1,000 to 1. Applications should encourage candidates who are more likely to get in, and similarly discourage candidates who are unlikely to get in.
Be aware of other fellowships’ deadlines
While one week is enough to complete one fellowship’s round-2 application, it’s certainly not enough to complete multiple fellowships’ round-2 applications.
I’m sure there are reasons for fellowships clustering their deadlines together so that they’re all on the same weekend. But if you choose to do this, please be aware of the cost: the increased workload will cause even very good fellows to appear worse if your deadline is the same as another large fellowship’s deadline. The best candidates will likely ignore a less-prestigious application if its deadline coincides with a more-prestigious one.
If tracking the various fellowships’ deadlines is too much to ask, please consider giving two weekends between announcing a round of interviews and the deadline for those interviews. This allows fellows to give one weekend to one fellowship, and the other weekend to your fellowship. Allowing for two weekends is especially important if you want to receive applications from people with a full-time job and not just bright-eyed bushy-tailed college students.
Be aware of your own deadlines
Most fellowship applications have two tiers: first you apply to the fellowship, and second you apply to the mentor(s) you’d like to work with. If you’re asking for candidates to review many (10+) mentors and then to apply to these mentors, please be aware of how long this could take to do properly. If a candidate wants to spend ~20 minutes reviewing each mentor, then looking through 10 mentors takes over 3 hours(!). This is before the candidate has started to answer the mentor’s questions. When deciding on deadlines, be aware that even deciding which mentors to apply for takes a long time.
If possible, make the list of mentors (and their biographies, research interests, etc) available publicly and early so that candidates can start reviewing the mentors as soon as possible. Gating this information behind the first round of interviews just reduces the quality of applications that the mentors end up reviewing, since candidates have less time to consider which mentors to apply to.
Fellowships compete for fellows
If your fellowship is clear and upfront about what’s expected of the candidates during applications, gives reasonable deadlines and makes reasonable accommodations, then your fellowship will have better fellows applying to it! Just as candidates compete for limited places in fellowships, the best candidates will have multiple choices for which fellowship to attend. It’s in your best interests to ensure the best candidates get through to your final round.
Anywhere-on-earth is the only timezone that matters
Most rationalist fellowships are very good about this (thanks!), but it’s worth saying: The only timezone for deadlines should be anywhere-on-earth.
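For anyone implementing this in an application form: anywhere-on-earth (AoE) corresponds to the fixed offset UTC−12:00, so “end of day, AoE” is midnight at that offset. Below is a minimal sketch using only Python’s standard library. The function names and the example year are hypothetical, chosen purely for illustration.

```python
from datetime import date, datetime, time, timedelta, timezone

# Anywhere on Earth (AoE) is the fixed offset UTC-12:00.
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_deadline_utc(day: date) -> datetime:
    """Return the first instant (in UTC) at which `day` has ended everywhere on Earth."""
    midnight_after = datetime.combine(day + timedelta(days=1), time.min, tzinfo=AOE)
    return midnight_after.astimezone(timezone.utc)


def is_on_time(submitted_at: datetime, day: date) -> bool:
    """True if a timezone-aware submission arrives before the AoE deadline."""
    if submitted_at.tzinfo is None:
        raise ValueError("submitted_at must be timezone-aware")
    return submitted_at < aoe_deadline_utc(day)


deadline_day = date(2025, 5, 18)  # hypothetical year, for illustration only
south_africa = timezone(timedelta(hours=2))  # SAST is UTC+2

print(aoe_deadline_utc(deadline_day))
print(is_on_time(datetime(2025, 5, 18, 23, 30, tzinfo=south_africa), deadline_day))
```

Storing the deadline as a single UTC instant and comparing timezone-aware timestamps means a candidate who submits at any point on the deadline date in their local time is accepted, wherever they are.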
Your application is probably northern-hemisphere centric
It’s very weird seeing DEI disclaimers about equality and then having the applications be in “winter” while it’s a bright summer’s day outside.
[1] Astra, CAIRF, ERA, MARS (multiple streams), MATS (multiple streams), Pivotal, and SPAR.