While I sort through whatever is happening with GPT-4, today’s scheduled post is two recent short stories about restaurant selection.

Ye Olde Restaurante

Tyler Cowen says that restaurants saying ‘since year 19xx’ are on net a bad sign, because they are frozen in time, focusing on being reliable.

[Image: a classic, old-fashioned steakhouse with dark wood paneling, leather booths, white linen tablecloths, and antique chandeliers.]

For the best meals, he says to look elsewhere, to places that shine brightly and then move on.

I was highly suspicious. So I ran a test.

I checked the oldest places in Manhattan. The list had 15 restaurants. A bunch are taverns, which are not relevant to my interests. The rest include the legendary Katz’s Delicatessen, which is still on the short list of very best available experiences (yes, of course you order the Pastrami), and the famous Keen’s Steakhouse. I don’t care for mutton, but their regular steaks are quite good. There’s also Peter Luger’s and PJ Clarke’s. There were also two less impressive steakhouses. Old Homestead is actively bad, and Delmonico’s was a great experience because we went to The Continental and then to John Wick 3, but it is objectively overpriced without being special.

Those are all ‘since 18xx,’ so extreme cases. What about typical cases?

Unfortunately, getting opening date data is tricky. Other lists I found did not actually correspond to when places opened all that well. I wasn’t able to easily test more systematically. Looking at ‘places you like the most’ has obvious bias issues. The one I love most opened in 1978. Others were newer but mostly not that recent. However I’ve had a lifetime to find them, so a question is, how fast and completely do I evaluate new offerings?

My guess is that:

  1. Most new restaurants are below average, and also rather uninteresting. The average new (non-chain) restaurant is higher quality now than in the past, but it is also less interesting.
  2. Average (mean or median) quality increases with age, at least initially, due to positive selection via survivorship. If a place folds quickly, you usually did not miss very much.
  3. Older places are selected for because they reward repeat business and being a regular. Thus, if you are trying places in your area, you should be sure to try such places, because there could be high value in being that regular. But in your area you should be checking out essentially all plausible options over time. The primary question to ask is, what is the upside of trying this? The best upside is not the best one time experience, it is a place you can add to your rotation that brings something new to the table.
  4. A place that survives will on average become a slightly worse choice over time, as the alternatives improve and it attempts to largely stay the same, once they get the kinks out in their early period.
  5. The places you love, in particular, will get worse for you, in particular, over time, because any change to them or to you will tend to be bad for the match, and also alternatives will improve.
  6. The very best food experiences require novelty of some kind from your perspective, it is true. So there is a certain kind of experience for which you want to try the new, but you have to be in strong exploration mode.
  7. But also very old restaurants often do something unique or uniquely well and have survived because of it, so they can offer a unique experience as well. The most unique things won’t be so old, but on average older things will be more unique.
  8. You can get a better advance read on older places than you can on new places.
  9. In expectation, all else being equal, selection effects dominate and older has higher EV. This is true even if your priority is ‘your experience today, right now.’
  10. However, all else is not equal, and the more additional filtering work you do, the more you should end up going to relatively new places.
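
The survivorship claim in point 2 can be sanity checked with a toy simulation. This is a minimal sketch under made-up assumptions, not a model fitted to any real data: quality is drawn from a standard normal distribution, and the yearly closure probability (the hypothetical `p_close` formula below) falls as quality rises. It only illustrates selection; it does not model the slow decline described in points 4 and 5.

```python
import random
from statistics import mean


def simulate_survivorship(n_restaurants=20000, max_age=30, seed=0):
    """Return {age: mean quality of restaurants still open at that age}.

    All numbers here are illustrative assumptions, not measurements.
    """
    rng = random.Random(seed)
    open_at_age = {age: [] for age in range(max_age + 1)}
    for _ in range(n_restaurants):
        quality = rng.gauss(0.0, 1.0)  # assumed: fixed for the restaurant's life
        # Assumed: better places close less often, clamped to [0.02, 0.9].
        p_close = min(0.9, max(0.02, 0.25 - 0.10 * quality))
        for age in range(max_age + 1):
            open_at_age[age].append(quality)  # still open at this age
            if rng.random() < p_close:
                break  # closed during this year
    # Skip ages with no survivors so mean() never sees an empty list.
    return {age: mean(qs) for age, qs in open_at_age.items() if qs}


if __name__ == "__main__":
    averages = simulate_survivorship()
    for age in (0, 1, 5, 10, 20):
        print(f"age {age:2d}: mean quality of survivors = {averages[age]:.2f}")
```

Under these assumptions the mean quality of survivors rises with age purely from selection, even though no individual restaurant ever improves. How fast it rises depends entirely on the assumed link between quality and closure.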

Ye Newfangled Restaurante

[Image: a whimsical illustration of people with maps, smartphones, and guidebooks searching a bustling cityscape of restaurants for a hidden gem.]

The good news here is that I strongly think Emmett Shear is centrally wrong.

Nick: I hate how well DoorDash ratings correlate with the restaurants I spent 10yrs searching out all the hidden gems i had are 4.9 and only the only false positive is Sweet Maple.

Emmett Shear: Yelp has destroyed the joy of exploration and discovery in exchange for efficiency and quality, and I’m not sure it’s a good trade in the end. Yes, I know I could just not look. But knowing it’s there and I could just look makes trying and it turning out meh just feel bad.

Zvi: This is so bizarre to see. Yelp ratings seem awful to me I use Google Maps instead, but beyond that it is the ratings that enable exploration to be worthwhile. You learn what is worth exploring! It’s great. Another tactic that you can use that I enjoy sometimes: Explore, then once you are physically looking at the place and it looks promising, check online before actually going in, and to get ideas on what to order. Best of both worlds.

Sophia: people claim often that the overall quality improvements over decades from yelp making it hard to run a bad restaurant are huge, which seems really good to me.

Emmett Shear later clarified that his theory is that Yelp is good when hipsters dominate the rankings inputs, but poor when tourists do so.

Exploration and discovery are vastly better, easier and more rewarding in the review era than they were in the pre-review era. The joy is higher, not lower. It is your choice how much exploration you still want to do versus exploitation, and how many risky ‘hidden gems’ you want to seek out and test. As I note, one good tactic is to literally walk the streets anyway, see what is available, and only then go online to verify.

Also, one can test Nick’s theory that the ratings are actually indicative.

As a baseline, let’s use this market as a source of places that I think were valuable to find, plus anything I pick up along the way that seems like an oversight on that list. Any exploration procedure should place a high priority on finding them. Google Maps ratings will often fail entirely to differentiate these places from other similar places that I like less. The ratings are highly valuable, but they do not let you not do the work, especially for ethnic restaurants, where ‘do they handle delivery and customer service well’ is a huge portion of the rating.

We also want to check false positives. Those are rarer: if a place has an exceptional rating it is probably good, but the number alone does not reliably make it great.

So let’s check. Will DoorDash or Yelp do better? I am doing this hungry.

The average rating on DoorDash of the places that were there was 4.64. That is a good rating. But it is not an exceptional rating. The default filter is to only show you places at 4.5 or higher. The signal here seems to be filtering out places that have big issues, but it does not seem good at identifying exceptional things. The places the app was suggesting were not differentiable via rating.

What about Yelp? I had it search my area. Of the first 10 hits, there was one legit hit, and multiple places I know are mediocre, but also they are clearly not sorting by rating there. Sorting by highest rating got a bunch of places with a small number of ratings that I did not recognize.

When I looked at the Yelp ratings of my top places, the ratings of the top half of that list (5.0s) averaged 4.0, and the ratings of the bottom half of that list (not 5.0s) also averaged 4.0. There did not seem to be a pattern based on whether tourists would dominate. My model continues to say that Yelp has its finger on the scale, and that is why the ratings are not so useful, but to be clear I do not have proof.
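
For anyone who wants to repeat this check on their own list, the comparison is just two group means. The sketch below uses placeholder numbers invented for illustration (they are not my actual list or real Yelp data), arranged so both halves average 4.0 as in the result above. The field names `mine` and `platform` are hypothetical.

```python
from statistics import mean

# Placeholder data, invented for illustration only.
# "mine" is a personal rating, "platform" is the review site's rating.
places = [
    {"name": "Place A", "mine": 5.0, "platform": 4.5},
    {"name": "Place B", "mine": 5.0, "platform": 3.5},
    {"name": "Place C", "mine": 5.0, "platform": 4.0},
    {"name": "Place D", "mine": 4.5, "platform": 4.0},
    {"name": "Place E", "mine": 4.5, "platform": 4.5},
    {"name": "Place F", "mine": 4.0, "platform": 3.5},
]


def split_means(rows, cutoff=5.0):
    """Mean platform rating for personal favorites vs. everything else."""
    top = [r["platform"] for r in rows if r["mine"] >= cutoff]
    rest = [r["platform"] for r in rows if r["mine"] < cutoff]
    return mean(top), mean(rest)


top_mean, rest_mean = split_means(places)
gap = top_mean - rest_mean
print(f"favorites: {top_mean:.2f}, others: {rest_mean:.2f}, gap: {gap:.2f}")
```

A gap near zero means the platform's number is not separating your favorites from places you merely like. With only a handful of places this is suggestive at best.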

Looking at specific places made it very clear, once again, that Yelp ratings are worthless. They do not even have vague agreement among different outposts of the same chain (Naya) where I have always had entirely undifferentiated experiences.

Certainly none of this constitutes sufficiently good evidence that one can afford to cease exploring. Nor, on the flip side, should it tempt you into forgoing the joys of exploration. You still have to use your wits, learn to read the signs, adjust for your preferences, and then eat around and find out.
