When I was 24 I had a hard time getting a job as a software developer. As a self-taught engineer, I had no credentials. I was bad at writing resumes and cover letters. And I was bad at interviewing. Then I read Hiring is Obsolete.

If you start a startup, you'll probably fail. Most startups fail. It's the nature of the business. But it's not necessarily a mistake to try something that has a 90% chance of failing, if you can afford the risk. Failing at 40, when you have a family to support, could be serious. But if you fail at 22, so what? If you try to start a startup right out of college and it tanks, you'll end up at 23 broke and a lot smarter. Which, if you think about it, is roughly what you hope to get from a graduate program.

Even if your startup does tank, you won't harm your prospects with employers. To make sure I asked some friends who work for big companies. I asked managers at Yahoo, Google, Amazon, Cisco and Microsoft how they'd feel about two candidates, both 24, with equal ability, one who'd tried to start a startup that tanked, and another who'd spent the two years since college working as a developer at a big company. Every one responded that they'd prefer the guy who'd tried to start his own company. Zod Nazem, who's in charge of engineering at Yahoo, said:

"I actually put more value on the guy with the failed startup. And you can quote me!"

So there you have it. Want to get hired by Yahoo? Start your own company.

"Hey," I thought, "I'm 24. I can game the system! If I start a startup with the deliberate intention to fail after a few months then I can get hired as a software developer."

I'll be 28 next week. On the one hand, things are going well. On the other hand, I'm still waiting for Yahoo's recruitment email.

Here is a list of the biggest things that surprised me about starting a startup.

1. It's easier than I expected

Keeping a startup alive is harder than startup founders expect. Otherwise there wouldn't be a 90% failure rate.

When I was a teenager, "luxury vacation" meant sleeping in a campsite. "Typical vacation" meant sleeping with a machete for self-defense.

When I was 24, I flew to Shanghai. I rented the cheapest apartment I could find. To get there you take the subway as far west as it will go. Get off the subway and walk another half-mile west. On your left is an abandoned shopping mall with boarded-up stores and broken escalators. On the right is a large compound with a broken turnstile at the entrance. Walk to the house at the back of the compound. I slept in the kitchen cupboard.

If you grow up in Asia or Africa, or even in poverty in the USA, experiences like this are the norm. But they're unusual for Computer Science graduates. So when Paul Graham says "The best way to put it might be that starting a startup is fun the way a survivalist training course would be fun" followed by "When I look at the responses, the common theme is that starting a startup was like I said, but way more so", I have to reverse this advice.

2. Investors don't like hardware

Investors don't like hardware startups. According to Paul Graham, "Out of 84 companies [in YC], 7 were making hardware. On the whole they've done better than the companies that weren't." He takes this overperformance as evidence that YC is biased against hardware companies.

YC isn't alone here. I recently had a surreal conversation with an investor. He basically said "I love your team and you're making lots of money selling hardware, but you can't make billions of dollars selling hardware because Apple will crush you. Besides, Google just bought Fitbit for billions of dollars. That means you can't make billions of dollars starting a hardware startup. Even if you could, venture capitalists would never fund you and I can't fund you because they won't."[1]

The best thing about selling hardware is it provides immediate revenue. If investors won't fund us then we can bootstrap everything. If those same investors won't fund our competitors then we can take our time.

3. Hardware is easier and cheaper than I expected

Starting a hardware startup is easier and faster than I expected in every respect. It's not just us. Experienced investors recently estimated that such-and-such part would cost $40,000 to make. My CEO got it done for $12,000 in a rush order.

Maybe this is because my CEO and I speak Chinese and our family is from the Republic of China. Maybe we're unusually scrappy engineers. Or maybe hardware has gotten cheaper recently and the market hasn't caught up yet.

4. Lisp is powerful

Machine learning is a core part of our product. We wrote a system to automate hyperparameter search. Originally this was written in Python, but as it got more and more meta, we ported most of it to Lisp. Within a few months, we had a general-purpose system for hyperparameter search with layered caching for small data[2]. While it is possible to write this sort of thing in Python, it would have been prohibitively expensive.

5. Younger is better

When I started this company I had 1.5 years of professional software development experience plus 1 year of working part-time at a physics lab where I programmed computers in between cutting sheet metal and calibrating gamma ray detectors. I was afraid someone middle-aged with decades more experience and savings would crush us. I had it all backward.

It's true that the amount of money you have increases with age, but so do your expenses. Expenses are more important than savings in the startup game. A hypothetical 22-year-old with $25k/year expenses and 1 year of runway has an advantage over a 40-year-old with $150k/year expenses and 2 years of runway. The 22-year-old needs only $25k/year to hit ramen profitability. The 40-year-old needs $150k/year.

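This applies even below ramen profitability. Revenue of $20k/year gives the 40-year-old a little under 4 extra months of runway ($300k of savings burned at $130k/year instead of $150k/year). The same revenue of $20k/year increases the 22-year-old's runway by 4 years ($25k of savings burned at $5k/year instead of $25k/year).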
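The runway comparison can be checked with a few lines of arithmetic. This is a minimal sketch, assuming runway means savings divided by net burn (expenses minus revenue) and that each founder's savings equal their stated expenses times their stated runway. The function name `runway_years` is made up for illustration. Under that definition the gain for the 40-year-old comes out somewhat higher than a quick estimate, at just under four months, but the asymmetry is the same.

```python
def runway_years(savings, expenses_per_year, revenue_per_year=0.0):
    """Years until savings run out; infinite once revenue covers expenses."""
    net_burn = expenses_per_year - revenue_per_year
    if net_burn <= 0:
        return float("inf")
    return savings / net_burn


# Assumption: savings are implied by the figures in the text
# (annual expenses multiplied by years of runway).
young_savings = 25_000 * 1    # 22-year-old: $25k/year expenses, 1 year of runway
older_savings = 150_000 * 2   # 40-year-old: $150k/year expenses, 2 years of runway

revenue = 20_000  # the same below-ramen revenue for both founders

young_gain = runway_years(young_savings, 25_000, revenue) - runway_years(young_savings, 25_000)
older_gain = runway_years(older_savings, 150_000, revenue) - runway_years(older_savings, 150_000)

print(f"22-year-old gains {young_gain:.1f} years of runway")
print(f"40-year-old gains {older_gain * 12:.1f} months of runway")
```

The 22-year-old's burn drops from $25k to $5k a year, which multiplies the runway by five. The 40-year-old's burn drops from $150k to $130k a year, which barely moves it.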

What about experience?

When you're starting a startup, you have to do lots of different things. I've had to write full-stack web apps, native Android apps, native iOS apps, smartwatch apps, firmware for microcontrollers, machine learning systems and a compiler—and that's just my software development duties.

There is no way to know all these things in advance. You have to be adaptable. Older people are not more adaptable than younger people.

Lastly is the effect on one's career. A 22-year-old who flies a startup into the ground has just jumpstarted zir career in software development. A 40-year-old who flies a startup into the ground has postponed zir retirement. A 22-year-old who becomes a billionaire gets to enjoy it for the rest of zir life. A 40-year-old who becomes a billionaire should have done it before working a lower-paying job for the last 18 years.

6. Change

The longer I run a startup, the more I feel my personality drifting away from my engineering friends. They like software less and less every year. Meanwhile, I love software just as much as when I was a teenager, except now I have good taste and can write software instead of just admiring others'.

My friends also seem increasingly docile. This one isn't them changing. It's me. This is disconcerting, even though 24-year-old me was just as docile as my friends are right now.

Maybe I was always destined to be an entrepreneur. I don't know. I never thought of myself as a "business person". Neither did my friends. But back when I was 24, a CEO asked me the following question before rejecting my job application.

Would you ever be happy working for someone else?

"Of course!" I replied, "That's why I'm applying to work for you!" I guess he noticed something back then. Maybe it had something to do with how my cover letter started "I built a competitor to your company last week...."

  1. I appreciate how this MBA was direct about the irrationality of the situation.

  2. Small data is machine learning with strong Bayesian priors. The most lucrative application of small data is alpha seeking in quantitative finance. We've considered open-sourcing our hyperparameter search tool. Please PM me if you're interested in discussing this, especially if you work in quantitative finance or a similar industry.

12 comments

The price of being a dog is comfortable, sociable boredom. The price of being a wolf is the freedom of loneliness and uncertainty.

Obviously, only the wolves that survive.

The global dog population is estimated at 900 million. There are two species of wild wolves: red wolves and grey wolves. Red wolves are critically endangered.

It's hard to find exact numbers on grey wolf populations in 2020. According to Wikipedia, grey wolf populations were estimated to be 300 thousand in 2003.

I'm curious why implementing the hyperparameter search in Python would have been prohibitively expensive, but wasn't in Hy. For context: I'm familiar with Clojure (which inspired Hy) and macros.

And I would like to know where you learned that sort of meta-programming.

Here's an example of something difficult to do in Python. lazy, stateless and minimize are custom macros.

  (stateless x float)
  (stateless y (* x x))
  (minimize y) ; nothing has been calculated yet
  (print y) ; 0.0 (this is where the first calculation occurs)
  (print y) ; 0.0 (the second evaluation of y just reads from the cache)
  (print x) ; 0.0 (this is read from the cache too)

The stateless macro caches results locally, backs up everything to a remote server in a background process and reads from the remote cache whenever possible.
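For readers who think in Python, the caching behaviour described above can be roughly imitated without macros. This is a sketch under loud assumptions and not the actual system: `LayeredCache` and `Lazy` are hypothetical names, the "remote server" is a plain dict, the write to it happens inline instead of in a background process, values are keyed by name only, and minimization is left out entirely.

```python
class LayeredCache:
    """Check a fast local dict first, then fall back to a slower remote store."""

    def __init__(self, remote):
        self.local = {}
        self.remote = remote  # hypothetical stand-in: any dict-like object

    def get_or_compute(self, key, compute):
        if key in self.local:
            return self.local[key]
        if key in self.remote:
            value = self.remote[key]
        else:
            value = compute()
            # A real system would push this to the server in the background.
            self.remote[key] = value
        self.local[key] = value
        return value


class Lazy:
    """A named value that is computed only when first requested."""

    def __init__(self, name, compute, cache):
        self.name = name
        self.compute = compute
        self.cache = cache
        self.calls = 0  # how many times compute actually ran

    def value(self):
        def run():
            self.calls += 1
            return self.compute()

        return self.cache.get_or_compute(self.name, run)


remote_store = {}
cache = LayeredCache(remote_store)
x = Lazy("x", lambda: 0.0, cache)
y = Lazy("y", lambda: x.value() * x.value(), cache)

# Nothing has been computed at this point.
first = y.value()   # computes x once, then y
second = y.value()  # served from the local cache

# A fresh process with an empty local cache can still reuse the remote copy.
other_cache = LayeredCache(remote_store)
y_elsewhere = Lazy("y", lambda: 1 / 0, other_cache)  # compute must never run
reused = y_elsewhere.value()
```

The clunky part is visible: every dependency has to be wrapped and named by hand, which is the bookkeeping a macro can hide.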

And I would like to know where you learned that sort of meta-programming.

Any decent Lisp book will cover how to write a macro. The real challenge is knowing what to write, not how to write it.

I know of no good books on this subject. In my experience, you have to understand what it's like to use many different software paradigms and how they are implemented. Then you can just steal their most relevant features as you need them. This particular system took inspiration from Haskell, R and applied mathematics. Under the hood, it makes heavy use of syntax trees, hash-based lookups, lazy evaluation and Bayesian optimization.
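To make "syntax trees plus hash-based lookups" concrete, here is a toy Python sketch, not the real implementation. It assumes expressions are nested tuples such as `("*", "x", "x")`, supports only `+` and `*`, and uses the made-up helper names `tree_key` and `evaluate`. Each subtree is hashed together with its variable bindings, and the hash is the cache key.

```python
import hashlib
import operator

OPS = {"+": operator.add, "*": operator.mul}


def tree_key(expr):
    """Stable hash of an expression tree, usable as a cache key across runs."""
    return hashlib.sha256(repr(expr).encode("utf-8")).hexdigest()


def evaluate(expr, env, cache):
    """Evaluate a nested-tuple expression, memoising each subtree by its hash."""
    if isinstance(expr, str):
        return env[expr]  # a variable name
    if isinstance(expr, (int, float)):
        return expr  # a literal
    # The key covers both the subtree and the variable bindings it runs under.
    key = tree_key((expr, tuple(sorted(env.items()))))
    if key not in cache:
        op, *args = expr
        values = [evaluate(arg, env, cache) for arg in args]
        result = values[0]
        for value in values[1:]:
            result = OPS[op](result, value)
        cache[key] = result
    return cache[key]


cache = {}
env = {"x": 3.0}
square = evaluate(("*", "x", "x"), env, cache)
shifted = evaluate(("+", ("*", "x", "x"), 1), env, cache)  # reuses the cached square
```

Because the key is derived from the tree itself, two expressions that are structurally identical share a cache entry no matter where they appear in the program.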

How to practice meta-programming commercially is an even harder question. Most companies don't use a meta-enough language like Lisp and those which do may not need meta-software at all. The only place I can think of where this would have net positive commercial value is a tiny startup working on a very hard problem. Small data comes to mind, but not much else.

This is very, very cool. Having come from the functional programming world, I frequently miss these features when doing machine learning in Python, and haven't been able to easily replicate them. I think there's a lot of easy optimization that could happen in day-to-day exploratory machine learning code that bog standard pandas/scikit-learn doesn't do.

This is encouraging to hear. When I talk about this stuff to ML engineers, some instantly get it, especially when they come from a functional programming background. Others don't and it feels like there's a wall between me and them.

I think I can replicate a lot of this in Python, even if it's a little clunky. It's just easier to start in Hy and then write a wrapper to port it to Python.

I know of no good books on this subject. In my experience, you have to understand what it's like to use many different software paradigms and how they are implemented.

Maybe Concepts, Techniques, and Models of Computer Programming? I haven't finished that one, but the first part was good.

Thanks for the detailed reply!

The real challenge is knowing what to write, not how to write it.

Yeah, this is the difficult thing for me. I've written extensions of basic forms like cond. But I haven't yet had an insight like: ‘this is a problem I can solve much more elegantly with macros than with plain functional code’.

Maybe a way to get there would be to dive back into On Lisp (http://www.paulgraham.com/onlisp.html) or Let Over Lambda (https://letoverlambda.com/). Although, if you know of no good books, maybe these don't suffice either. :-)

I liked Chapters 1 and 2 of On Lisp. After that, I felt like it degenerated into a design patterns book. The design patterns Paul Graham needed 27 years ago aren't the design patterns I need right now. I prefer Practical Common Lisp as a textbook. Ironically, Practical Common Lisp is extremely impractical in 2020 but I feel it demonstrates high-level Lisp programming better through its use of extremely dense code.

I've never read Let Over Lambda. Judging by the table of contents, it looks like an exceptionally good book on how to write a macro but—once again—not when to write a macro.

Instead of diving back into your Lisp textbooks, I recommend this advice from Paul Graham's Rarely-Asked Questions:

How can I become really good at Lisp programming?

Write an application big enough that you can make the lower levels into a language layer. Embedded languages (or as they now seem to be called, DSLs) are the essence of Lisp hacking.

I thought that if I read enough examples of macros and practiced writing powerful ones (not just that custom cond I mentioned), I would start seeing possible applications. You appear to have a different opinion. Anyway, I think I would get your point if I explored more real-world macro-enabled code.

Practical Common Lisp is extremely impractical in 2020 but I feel it demonstrates high-level Lisp programming better through its use of extremely dense code.

I have read The Joy of Clojure, which imbued me with some good Lisp spirit. And I've moved Practical Common Lisp up on my reading list above On Lisp and Let Over Lambda, thanks to your brief review.

make the lower levels into a language layer

Good reminder. I think I've done something like that here: https://github.com/rmoehn/jursey/blob/master/test/clarification-swallows.repl