I've found pair programming pretty useful when starting a new project from scratch, when changes are likely to be interdependent. It is then better to work at, let's say, 1.5x the performance of a single developer on one thing at a time than to work separately and then try to reconcile the changes. Knowledge transfer is also very important at this stage (you get more people with the same vision of the fundamentals).
This generalizes to other cases when there is a "narrow front" - when few things can be worked on in parallel without stepping on each other's toes.
Even more generally, it seems there are three kinds of clear benefits:
1) Less change synchronization (fewer changes worked on at a time).
2) Knowledge transfer (see @FeepingCreature's answer).
3) Immediate, detailed review - probably fewer defects.
There is also the matter of raw throughput: how much time is required to make a specific change while the rest of the code is assumed to stay the same, ignoring the cost of syncing with any changes made in parallel. A naive baseline is that a pair has the throughput of a single developer (since they're working on one change at a time). Fortunately, it can be way better, because one person can focus on the details of the code while the other keeps track of the slightly bigger picture and the next steps, looks up the relevant facts in the documentation, etc. This eliminates a lot of context switching and limits the number of things that each developer needs to keep in working memory. Also, a lot of typos and other simple problems get caught immediately, so there is less debugging to do. It's not so clear what all of this adds up to.
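To make the throughput trade-off a bit more concrete, here is a rough sketch of the arithmetic. It assumes both people are occupied for the whole duration of the change, and it ignores synchronization costs and defect rates entirely. The function name `pair_cost` and the multipliers tried are just illustrative choices, not measurements.

```python
def pair_cost(pair_throughput: float) -> tuple[float, float]:
    """Return (wall_clock, person_hours) for a pair, relative to one solo developer.

    pair_throughput is the pair's speed on a single change as a multiple of a
    solo developer's speed (assumption: 1.0 is the naive baseline, 1.5 is the
    "let's say" figure from the discussion, not a measured value).
    """
    if pair_throughput <= 0:
        raise ValueError("pair_throughput must be positive")
    wall_clock = 1.0 / pair_throughput  # elapsed time relative to solo
    person_hours = 2.0 * wall_clock     # two people are busy for that time
    return wall_clock, person_hours


for multiplier in (1.0, 1.5, 2.0):
    wall_clock, person_hours = pair_cost(multiplier)
    print(
        f"{multiplier:.1f}x throughput: "
        f"{wall_clock:.2f}x wall-clock, {person_hours:.2f}x person-hours"
    )
```

Under these assumptions the pair only breaks even on person-hours at 2x throughput; anything below that has to be paid for by the other benefits (less syncing, knowledge transfer, fewer defects).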
I was able to find some studies on the topic, including a meta-analysis by Hannay et al. TL;DR: it depends on the situation, including how experienced the developers are and how complex the task is. It's clearly not a silver bullet, and in general it still seems to be a trade-off between person-hours spent and the quality of the produced software.
One thing I feel strongly about is that pair-programming is a skill, and that if both participants aren't skilled at it, things will be rough. When I was at my coding bootcamp, they'd always separate us into pairs and have us pair program on the exercises they'd give us. It was terrible. The driver would always just go off on their own.
My experience with pair programming has been mostly positive. I've taken both roles, but more often as the navigator.
Incentives in school are different from those at a real company: individual grades vs. the success of the company that's paying your salary; getting along with people you may never see again after this semester vs. the approval of peers you may be working with for years. Also, the kind of people in a coding bootcamp are not necessarily the kind of people who would actually get hired to do programming.
Junior/senior pairings seem to work best when the senior is navigating and the junior is driving. Yes, it helps if the senior is patient and the junior is willing, but the junior is not particularly skilled at programming generally.
Found a long dissertation on the topic: https://collaboration.csc.ncsu.edu/laurie/Papers/dissertation.pdf
From the abstract:
The abstract also mentioned enhanced skills and teamwork as benefits.
It's easy to play armchair statistician and contribute little, but I want to point out that the empirics cited here are effectively just anecdotes. The paper studies 13 pairs and 13 individuals across three assignments in one class at the University of Utah. Its estimate of relative time costs is only significant to ~1σ because development time has a variance of (if I backsolved correctly) 65%, which... seems about right. Still, it seems like borderline abuse of frequentist statistics to argue that a two-tailed p<0.05 should be required to reject the hypothesis that pairs finish projects in half the wall-clock time of individuals (which is the null the analysis assumes).
That said, the author correctly identifies that quality matters significantly more than speed. The quality metric, however, is "assignment tests passed" in throwaway academic projects, eliding the questions of what quality failures would or wouldn't be caught by the review / CI workflows that an industrial project would be going through anyway.
So, finger to the wind, this study feels like it suggests that a pair spends 15% more person-hours (once they get used to each other) before turning their schoolwork in, and does 15% more of the work of the assignment than a student working alone. Consistent with the higher reported work-enjoyment numbers! Definitely a stronger showing than I would have guessed! But definitely not well-abstracted by "no significant result for time; significant improvement for quality".
What am I missing here?
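For what it's worth, here is the finger-to-the-wind estimate above written out as arithmetic. This is only a sketch of my own reading: the two 15% figures are the rough numbers from this comment, not values taken from the paper, and the hypothetical helper `pair_vs_solo` assumes both members of the pair are present for all of the pair's working time.

```python
def pair_vs_solo(extra_person_hours: float, extra_work: float) -> dict:
    """Compare a pair against one solo student, with the solo student as 1.0.

    extra_person_hours: fractional extra person-hours the pair spends (0.15 = 15%).
    extra_work: fractional extra share of the assignment the pair completes.
    Assumption: both partners work together the whole time, so wall-clock time
    is half of the pair's person-hours.
    """
    person_hours = 1.0 + extra_person_hours
    work = 1.0 + extra_work
    return {
        "wall_clock": person_hours / 2.0,             # elapsed time vs. solo
        "work_per_person_hour": work / person_hours,  # output per hour of labor
    }


result = pair_vs_solo(0.15, 0.15)
print(result)
```

Read this way, the pair finishes in a bit under 60% of the solo wall-clock time while producing about the same amount of work per person-hour, which is a different summary than "no significant result for time".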
I've only had one experience with pair programming, and it was a positive one, both in terms of emotional satisfaction and productivity.
But I suspect that it is a pleasant experience when you are paired with a person you would otherwise enjoy talking to about programming. Because that's more or less what it is, except you are also producing the code you talk about.
If I had to pair-program with a person I don't "click" with -- either because of personality, or because of wildly different opinions on what is the desirable way to write code -- I can imagine it could become a form of torture. (But that's just a guess; I didn't have an opportunity to try.)
For this reason, I imagine the answer to whether "pair-programming is better" would depend on many things. How compatible are the team members? Are you allowed to choose your pair, or do you get one assigned against your will? (What happens when one member of the pair takes a vacation?)
But talking openly about personal compatibility is something I can hardly imagine in a workplace. I mean, jobs are usually hierarchical, a hierarchical environment is antithetical to sincerity, and expressing your true feelings could be taken as unprofessional behavior. So you could get people reporting that they "click" with everyone (or everyone high-status) just because they want to be seen as "team players", or because they want to be paired with someone highly productive so that their pair productivity will also be high.
In summary, I imagine proper research would need to take personal compatibility into account, but there are incentives for participants to provide wrong information. The research would have to address this.
I had a pair programming experience at my first job back in the late 80s, before it was a thing, and my coworker and I clicked well, so it was fun while it lasted. Never had a chance to do it again, but miss it a lot. Wish I could work at a place where this is practiced.