Many people ask me why I chose not to pursue a PhD directly after college. I figured I might as well centralize my thoughts in one place. This essay is for computer science or technical grads who:

  • Are interested in machine learning
  • Enjoy working on hard, unsolved technical challenges
  • Are possibly choosing between research and industry

As someone who deeply enjoys thinking about and coding solutions to open-ended research questions, I genuinely considered going to a PhD program. To come to a decision, I spent a while pondering why many students with research experience don’t pursue a PhD directly after undergrad. I think there are many reasons, including:

  • Effort: it’s a lot of work to write applications for many schools, get solid recommendation letters, pay a lot of money in application fees, go through interviews, align with professors, secure funding, and more.
  • Comfort: industry perks and salaries can be more attractive than graduate school stipends and long work hours.
  • Inertia: if you’re not absolutely sure it’s right for you, the thought of switching trajectories midway through the program is very anxiety-provoking.

I felt like I could put in the effort to apply and would enjoy a PhD program if I got in, but I still didn’t know what to do. Delaying the decision didn’t feel optimal either, since even if I worked in industry for a few years before going to a PhD program, I didn’t think I’d easily adjust to a large pay cut and to working on problem sets again. Understandably, the PhD vs Industry decision caused me a lot of stress! Halfway through my senior year, I wrote down some pros and cons for each choice. In practice, an industry job for me meant working at a startup, since I didn’t really enjoy my engineering internships at large companies.

Here, I discuss the raw pros and cons of pursuing a PhD and pros and cons of joining an early-stage machine learning startup. I wrote these in my journal in January 2019, and I provide some relevant commentary from my current self as a machine learning engineer (July 2020) in the nested bullets.

Pros of pursuing a PhD (immediately after graduating)

  • I can prioritize my curiosity and have the freedom to learn about what I find intellectually engaging.
    • Now that I’ve been at a startup, although I can still be curious about many things, I have to work on things that provide business value. The ability to not be constrained by industry value is a real pro here.
  • I can deeply hone technical expertise and have many people respect me for it.
    • I’m cringing at how much external validation I used to seek. Being in computer science is playing the long game. You don’t always get external validation, and good work can take a while.
  • It would be easier to “be my own boss” if I were doing research.
  • I actually really enjoy research more than engineering.
    • Sometimes I still feel that I enjoy research more than engineering, but I’ve committed to developing my engineering skills. Again, being in tech is playing the long game, and having good engineering skills will help me no matter where I go, whether it’s research or IC or management.
  • There are clearer mentorship opportunities for me.
    • This would probably be true if I went to a PhD program, since I’d have access to many professors and older grad student friends.
  • I can have a stable career path (at least 5-6 years if doing a PhD).
    • This point occasionally gives me anxiety, since I don’t know exactly where I am going to be for the next 5-6 years. Startups are risky. But I’ve gotten more used to this feeling over the past year.

Cons of pursuing a PhD (immediately after graduating)

  • Maybe I’m a shitty researcher or will find out that I’m a shitty researcher.
    • Still a fear of mine.
  • It’s unclear whether I want to end up in academia or industry research, so a PhD could be for nothing.
    • One of my math PhD friends explains a PhD as “a socially acceptable reason for screwing around with whatever you’re interested in.” I like this framing: going to a PhD program isn’t technically a waste if I enjoy the process.
  • Incentives in academia are horrible. There are so many major conferences. How do you balance long term gains with getting the publications out?
    • This is really true. A year ago, I didn’t know how to think long-term. I probably would have tried to maximize my number of papers and burned out in my 3rd year, complaining that I’m not working on anything truly impactful.
  • A PhD can be really long and hard.
    • I love Philip Guo’s The PhD Grind. Unfortunately, he recently took down his website, which hosted the book.

Pros of joining an ML startup

  • I can learn how to build things from scratch.
    • Painfully true — building from scratch is challenging, but the growing pains are necessary to deeply understand productionizing ML.
    • At the time, I didn’t know about the existence of “real-world” ML and engineering tools such as Spark, Airflow, Kubernetes, or Docker. These tools are super relevant and learning them can be its own pro.
  • I can see how research is applied in industry and measure the gap between research and industry.
    • This gap is larger than I expected. It’s also really fun to apply state-of-the-art research to business cases. For example, we recently productionized a transformer model.
  • I can gain leadership skills from the beginning and be a key figure in recruiting and defining the culture.
    • I was thrown into the fire as soon as I joined. Now, a year later, I’ve probably done over a hundred technical interviews, led a few ML projects for key clients, and mentored a couple of interns. I would definitely not have gotten this experience at a larger company.
    • It’s also cool to be able to choose who you work with (to some extent) by hiring people that fit in with the culture you help to define.
  • I can closely work with both experienced engineers and non-technical domain experts.
    • I have learned so much from every single one of my coworkers.
  • I am relatively immune to any level of risk this early in my career.
    • This honestly just speaks to my insane privileges of graduating from a top school, having experience at big tech companies, and being a US citizen. I am quite thankful for these things.

Cons of joining an ML startup

  • Possibly being the only woman, potentially for a long time.
    • Bleh. True. No comment.
  • This is not the route to maximize expected earnings. I could get paid more at a larger company. Only in the small chance that the startup succeeds will I make a lot of money.
    • True, but I now feel like this really isn’t a priority for me. Being in tech gives me the privilege of making a high salary wherever I go, and I think that as long as I can live comfortably (as I do now), I’m happy. Maybe this will change if I’m trying to buy a house or have a family.
  • My technical accomplishments aren’t seen outside the startup.
  • I could be overworking myself.
    • I think this is true of any job. There are always opportunities to work more than you’re expected to. Also, as a new grad, optimizing for learning rate can feel like overworking oneself, especially in the beginning.
  • If I don’t care about the specific application, I might not be motivated to do quality work.
    • Surprisingly, I don’t care too much about cars (my company’s application), but I am really, really motivated to solve the technical challenges my company faces.

My story and decision-making process [1]

Thinking back to the fall of 2018, I remember how stressed I felt drafting my statement of purpose. I’m the kind of person who’s never fully confident in my decisions — there’s always a nagging what if? in my brain. I didn’t feel a magnetic pull towards academia. I also didn’t feel like I was destined to make a huge impact in industry. What does one pursue if they’re just…unsure? I felt this urge to apply to PhD programs — after all, I grew up in a college town, I’m a professor’s daughter, and most of my closest childhood and university friends are in PhD programs.

One of my older friends told me, “If you know what problem you want to solve, then find the place where you’re most likely to help devise a solution.” Robustness in machine learning models is something only academics are working on, I naively thought at the time. So maybe a PhD was the right move. But as I drafted my statement of purpose, writing about how we needed to construct and solve relevant toy problems because it would be imperative to have safe and robust machine learning models in the real world, it seemed silly to me that I wanted to research this field when I had never actually worked on machine learning in the real world. What new perspectives could I bring? Why should a school accept me?

My wonderful undergraduate advisor, Pat Hanrahan, urged me to consider working in industry for a couple years before going to a PhD program. I didn’t like the suggestion at first — what if I would never end up doing a PhD? “Well, then you’ve saved yourself five or six years! If you don’t want to do a PhD, you’ll find out eventually,” he laughed.

Pat had a point, so I started to look for industry machine learning jobs. Fortunately, since I had significant ML research experience for an undergrad, I could choose among many options. Did I want to do machine learning at a large company? Small company? Focus on a specific vertical? I thought back to my machine learning project at Facebook, where I wrote a few lines of SQL to use a well-oiled tool called FBLearner to set up a gradient boosting classifier. I thought about machine learning at Google, where engineers write Borg scripts to throw lots of serialized data at one of many models in a zoo. I realized that if I really wanted to know what it was like to work on machine learning without big-company infrastructure at my disposal, as is the case for most machine learning practitioners, I needed to join a startup.

A year later, do I regret my decision or wish I had pursued a PhD? Not in the slightest. Working at an early-stage startup is nothing like what I had expected, and I feel incredibly lucky to be one of few people building end-to-end systems for cutting-edge, real-world machine learning applications. On the occasions that I feel frustrated about my job, it’s because the problems are insanely technically challenging and my ass is on the line to deliver results. It could be much worse; I’m so glad my frustrations and energies aren’t directed towards horrible political, sexist, or racist issues that many other people in tech have to deal with. I resonate so much with the idea that the right startup will “ruin your life in the best way possible” — I can’t imagine not having ownership over what I work on, working without teammates who are deeply invested in my well-being, and not being able to wear different hats at my job.

Fall is just around the corner, which means it will soon be time to apply to PhD programs. I don’t know if I’ll apply this cycle. I don’t know if I’ll apply in future years, either. But I know for sure that if I ever pursue a PhD, I’ll be much better prepared than I was right after graduation. I’ll know my styles of problem-solving well, and I’ll know to work on issues that matter in the long run. I’m excited for a future where I go back to my favorite toy problems in academia with a renewed focus and perspective on what actually matters in the real world.

Thanks to Emily Ross and Sahaj Garg for feedback on multiple drafts.

Footnotes

  1. This may come off as prescriptive, since I feel positively about my decision. Prescription is not my intention. This essay is just my perspective on the PhD vs Industry debate — if you’re trying to decide, you might want to ask multiple people. There are good reasons to pursue a PhD directly after college. Also, my advice may not be optimal during the COVID-19 pandemic.