System Design Interviews

As I network through tech I frequently hear newcomers ask how to answer system design questions. I’ll try to clear that up in this post.

Why am I being asked to design a parking lot?

If you’re comfortable writing code and work on multi-component projects, then you might be familiar with maximizing reuse without high coupling. In an interview, this is tested with object-oriented design questions. In a service-oriented world, we are now being asked to do the same with systems.

These concepts are inconsistently taught across all programs. Unlike a for loop, interviewers can’t always expect a candidate to be familiar with systems and services. Unfortunately, the interviewers are given a checklist and if it contains system design, they ask it. More bad news: the recruiters aren’t told what’s on the list. As a result, you and your interviewer have different ideas about what’s going to be asked.

How To Learn This

Question and Answer Format

Question

Questions will be phrased as: “design a system to XYZ”, “design a pre-existing software system”, “draw the architecture for XYZ”. Examples:

  • Existing software: Uber, Lyft, Facebook Messenger, Amazon retail website, Google Search, DNS, Expedia, Docker, GitHub, WordPress, any website you can think of.
  • Interview problems: Parking garage, generic search engine, inventory management system, hotel guest management system, log aggregation

Anything can come up here depending on how prepared or creative the interviewer is.

Boxes and Lines

The answer is going to be boxes and lines.

[Figure: boxes and lines]

Approaching the Question

Clarifying Questions and Requirements

If you’ve done any preparation or any interviews, you will know you need to ask clarifying questions and get the requirements. Treat your unimplemented solution as a black box and try to describe the inputs and the outputs. Take Twitter for example:

  • Are tweets text only?
  • How will customers get data out of the system? Browser, phone, REST API?
  • Can a customer send one tweet at a time or many?
  • Do users have accounts or is this anonymous?
  • Is user following enabled?
  • Are there multiple feeds or one giant aggregated feed of tweets?

You might be thinking “Do you even know how Twitter works?” If you are, good. You need to make sure you and the interviewer are on the same page. You can’t come up with a design that has every current Twitter feature, so make sure to scope it down to a few specific features. Saying “I’m going to implement user accounts, tweeting text only, and retrieving messages without the follower system as my first round” is extremely helpful. The interviewer will know that you are aware that there are more features and are choosing to defer them rather than forgetting them. This also makes the problem easier. If you don’t specify which aspects you are implementing, you and your interviewer will have different ideas about a good answer.

Note: These interviews are 45 to 60 min. It’s better to start small and then discuss enlarging the scope with the interviewer than the other way around.

Always identify the data.

The data tells you a lot about the design. What data is being sent? What data is being stored? Does the data need to be sorted? How quickly does the data need to be available? This provides another way to get requirements. Example:

  • How will data be searched? Maybe we start with hashtag search only.
  • Can anyone see all tweets or are some tweets restricted viewing? We can start with all public.
  • When we search for tweets, do we return all of them that match or only the top 100 most recent? Let’s do all of them at first for simplicity.
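Writing the answers down as a tiny data model can make the scope concrete. Here is a minimal sketch in Python, assuming text-only public tweets and hashtag-only search as scoped above. The names `Tweet` and `search_by_hashtag` are made up for illustration, and returning matches newest first is my own assumption, not a requirement.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone

HASHTAG_PATTERN = re.compile(r"#(\w+)")


@dataclass(frozen=True)
class Tweet:
    """What is stored: text only, always public in this first round."""
    tweet_id: int
    author: str
    text: str
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    @property
    def hashtags(self) -> set[str]:
        # Hashtags are the only search key, so normalize them to lowercase.
        return {tag.lower() for tag in HASHTAG_PATTERN.findall(self.text)}


def search_by_hashtag(tweets: list[Tweet], hashtag: str) -> list[Tweet]:
    """What is returned: every matching tweet (assumed order: newest first)."""
    wanted = hashtag.lstrip("#").lower()
    matches = [t for t in tweets if wanted in t.hashtags]
    return sorted(matches, key=lambda t: t.created_at, reverse=True)
```

Even a sketch this small surfaces follow-up questions, such as whether hashtags are case sensitive and whether results need an order at all.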

The Boxes

What are the boxes? Don’t be thrifty on the boxes. Boxes are more easily grouped than separated. You can do a “discovery” style and follow a single piece of data through the system.

For example, a customer hits the “Tweet” button from their browser. This goes to some server, the TweetManager. The tweet needs to be stored and show up in the user’s page and the global feed (as per requirements of no custom feeds). The tweet also needs to be searched later by hashtags (as per requirement of hashtag search only). This tells us we need a place to store the whole tweet, TweetStore. We might also need to store user information that can reference their tweets, UserStore and maybe UserManager. Somewhere else, we need to support hashtag searches, SearchTagManager and/or HashTagStore. Finally, we can throw in notifications for fun and add a NotificationManager. Here’s what that looks like:

[Figure: Twitlines diagram]

  1. The user tweets and the tweet is sent to the TweetManager (here we assume that the user is logged in and is who their cookies or headers say they are).
  2. The TweetManager stores the Tweet in the TweetStore. At this point it is linked to the user that posted it.
  3. The TweetManager sends the hashtags in the tweet, along with a reference to the tweet in the TweetStore, to the HashtagManager.
  4. The HashtagManager adds the tweet reference to the list of all tweets related to that hashtag in the hashtag store (this creates a link to the tweet store).
  5. The TweetManager sends a completed message to the notifier which then shows everyone logged in that a new tweet has been posted.

We also have Login and ‘A’ which is searching by hashtag.

Note: This is a simplified example. I would strongly discourage using this answer in an interview because it doesn’t scale, has high latency, and will likely result in data inconsistency.
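To make the five steps above concrete, here is a rough in-memory sketch in Python. It carries the same caveats as the diagram. The class names follow the boxes, but every method name (`save`, `index`, `lookup`, `notify`, `post`) is my own invention for illustration, and plain dictionaries stand in for real data stores.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class StoredTweet:
    tweet_id: int
    author: str
    text: str


class TweetStore:
    """Step 2: holds whole tweets, linked to the posting user."""

    def __init__(self) -> None:
        self._tweets = {}

    def save(self, author: str, text: str) -> int:
        tweet_id = len(self._tweets) + 1
        self._tweets[tweet_id] = StoredTweet(tweet_id, author, text)
        return tweet_id

    def get(self, tweet_id: int) -> StoredTweet:
        return self._tweets[tweet_id]


class HashtagManager:
    """Steps 3-4: maps each hashtag to references (IDs) into the TweetStore."""

    def __init__(self) -> None:
        self._tweet_ids_by_tag = defaultdict(list)

    def index(self, hashtags: list[str], tweet_id: int) -> None:
        for tag in hashtags:
            self._tweet_ids_by_tag[tag].append(tweet_id)

    def lookup(self, hashtag: str) -> list[int]:
        return list(self._tweet_ids_by_tag.get(hashtag.lower(), []))


class NotificationManager:
    """Step 5: only records messages; a real notifier would push to clients."""

    def __init__(self) -> None:
        self.sent = []

    def notify(self, message: str) -> None:
        self.sent.append(message)


class TweetManager:
    """Step 1: entry point; assumes the caller is already authenticated."""

    def __init__(self, store, hashtags, notifier) -> None:
        self.store, self.hashtags, self.notifier = store, hashtags, notifier

    def post(self, author: str, text: str) -> int:
        tweet_id = self.store.save(author, text)  # step 2
        found = (t.lower() for t in re.findall(r"#(\w+)", text))
        tags = list(dict.fromkeys(found))  # drop repeated hashtags
        self.hashtags.index(tags, tweet_id)  # steps 3-4
        self.notifier.notify(f"{author} posted tweet {tweet_id}")  # step 5
        return tweet_id
```

Searching by hashtag is then a lookup in the HashtagManager followed by a fetch from the TweetStore for each returned reference. Notice that `post` makes three blocking calls in a row, which is one reason this design has high latency.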

Hey, Where Are My Tables?

You may be tempted to say one of your boxes will be a SQL database and start describing your schema in detail. Don’t. Don’t do this unless the interviewer asks you to go into details. If you focus on details, you won’t have time to answer the whole question. However, having them at the ready doesn’t hurt in follow up questions.

Grouping

It turns out I have 3 storage boxes and 3 service boxes. Do they all need to be separate or can some be together? There are many driving factors for doing this:

  • Minimize coupling: you should minimize lines between two boxes for a single “trip” through the system
  • Follow the K.I.S.S. rule: you should not have to fan out to turn one tweet into 5 calls to other boxes
  • Asynchronous vs. Synchronous: know which lines need to be blocking and which can be done asynchronously. In some systems, the different types of calls go to different boxes (not always).
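The blocking versus asynchronous distinction is easier to discuss with a small example. Below is a minimal Python sketch, assuming that storing the tweet must block while notifying can happen later. The names `post_tweet`, `notification_worker`, and `notification_queue` are hypothetical, and plain lists stand in for the store and the connected clients.

```python
import queue
import threading

stored = []  # stands in for the TweetStore
delivered = []  # stands in for notifications pushed to clients
notification_queue = queue.Queue()


def notification_worker():
    """Drains the queue in the background until it sees the None sentinel."""
    while True:
        message = notification_queue.get()
        if message is None:
            break
        delivered.append(message)


def post_tweet(author, text):
    # Synchronous (blocking) line: the caller waits until the tweet is stored.
    stored.append((author, text))
    # Asynchronous line: enqueue the message and return without waiting
    # for anyone to receive it.
    notification_queue.put(f"{author} tweeted")
    return len(stored)


worker = threading.Thread(target=notification_worker, daemon=True)
worker.start()
post_tweet("ann", "hello #world")
notification_queue.put(None)  # tell the worker to stop
worker.join()  # wait so that `delivered` is complete before reading it
```

In a diagram, the queue becomes its own box between the two services, and the line into it is the one you can label as non-blocking.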

A simple grouping might be all the logic in one box and all the data in another:

[Figure: grouped boxes]

Scalability

Scalability is challenging. In these questions, you can either predict how to scale the system or you can do this exercise repeatedly:

  1. A significant political event is happening. A few key individuals are sending out tweets through the event. Everyone on the planet is trying to search for tweets with the event hashtag at the same time.
  2. Identify what will break first and how it will break. If it breaks, does it break only the overloaded part of the system or other parts too? In this case, the hashtag manager can actually be scaled to tolerate the number of requests. However, the data store (no matter what store you use) will be in trouble.
  3. Discuss or draw a way to solve this. The most common answers are asynchronous communication with queues, replicas, and some form of sharding.
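If the interviewer pushes for more detail on the hashtag data store, sharding and replicas can be sketched in a few lines. This is a toy in-memory version in Python under my own assumptions: `shard_for` and `ShardedHashtagStore` are hypothetical names, shards are chosen by hashing the hashtag, and every write is copied to all replicas immediately, which a real system would likely not do.

```python
import hashlib
import itertools


def shard_for(hashtag: str, shard_count: int) -> int:
    """Maps a hashtag to a shard with a hash that is stable across processes."""
    digest = hashlib.sha256(hashtag.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedHashtagStore:
    def __init__(self, shard_count: int, replicas_per_shard: int) -> None:
        # Each shard is a list of replicas; each replica maps
        # hashtag -> list of tweet IDs.
        self._shards = [
            [{} for _ in range(replicas_per_shard)]
            for _ in range(shard_count)
        ]
        # One round-robin iterator per shard spreads reads over its replicas.
        self._next_replica = [
            itertools.cycle(range(replicas_per_shard))
            for _ in range(shard_count)
        ]

    def add(self, hashtag: str, tweet_id: int) -> None:
        shard = self._shards[shard_for(hashtag, len(self._shards))]
        # Simplification: write to every replica right away.
        for replica in shard:
            replica.setdefault(hashtag.lower(), []).append(tweet_id)

    def lookup(self, hashtag: str) -> list[int]:
        index = shard_for(hashtag, len(self._shards))
        replica = self._shards[index][next(self._next_replica[index])]
        return list(replica.get(hashtag.lower(), []))
```

One thing worth saying out loud in the political event scenario: a single hot hashtag always lands on the same shard, so sharding by hashtag alone does not spread that load. The read replicas (or a cache) are what absorb it.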

Often you need to do one or two examples to show you understand what scaling is and how to alter a system to support more traffic.

New Features

You’re feeling great because you drew boxes and lines. You came up with your own scaling answers before the interviewer asked. Then the interviewer throws a giant wrench into your answer: I want you to implement followers, identity verification, and malicious account detection.

If you only have 5 minutes left, feel free to verbally explain while waving your hands at your diagram. Throw in a quick fraud-detection box backed by a trained machine learning model and a service that requests identity verification through a third-party vendor. Usually when these features are requested near the end of the interview, it isn’t about implementing them but about discussing how you can grow your solution. If you can’t discuss the solution, it might look like you copied it from a book.

Follow Up Questions

  • Why is this data here and who can access it?
  • How long does it take to complete one customer request?
  • Why are these two components separate or together?
  • Can you make this more generic or into a platform? How would you convert this into a Software as a Service product?

I Did All The Right Things And All I Got Was This T-Shirt

I’ve had too many interviews where the interviewer was stubborn, critical, condescending, dismissive, and basically an asshole. It’s almost impossible to succeed in these cases unless you fit their mold perfectly and even then it’s not a guarantee. Here are some signs that it might not be your answer that’s at fault:

  • Your interviewer is flipping between high level criteria and specific technologies (e.g. we need a customer service but please use AWS Lambda)
  • Your interviewer is telling you where to draw your lines without explaining why
  • Your interviewer is repeatedly cutting you off when you try to ask clarifying questions and replies with “just implement it the way it is”
  • Your interviewer says “uhhh” a lot and doesn’t seem to know where he or she is going with the question
  • Your interviewer changes the question halfway through: did I say architecture? I meant library
  • You were incorrectly leveled by your recruiter (surprise!) and you feel deeply in over your head.

Leveling?

All companies put their applicants into buckets that are highly variable but sort of follow this structure:

  • 0 to 2 years of experience: entry level
  • 2 to 5 years of experience: mid level
  • 5+ years of experience: senior level
  • 12+ years of experience: principal or CTO level

Highly variable by company, geographic area, job role, and how the recruiter feels that day.

I think it’s unreasonable to ask an entry level applicant a design question because they may or may not have encountered one. If an applicant is on the border between experience levels, the interviewer may ask this question to see which bucket the applicant fits in. Finally, as above, recruiters and interviewers don’t talk.

In Conclusion

Just don’t draw a single box with a SQL database schema as your answer. Good luck!

 

Defensive Interviewing

If you’ve interviewed anywhere in tech, you’ve heard the advice or instructions to have questions ready for your interviewers. Which questions do you ask though?

Hopes and Fears

Everything boils down to what you want to happen and what you don’t want to happen.

Hopes

  • Belonging
  • Achievement
  • Trust
  • Growth
  • Variety
  • Money – this one is salary negotiation so I’ll skip it

Fears

  • Exploitation
  • Rejection and isolation
  • Boredom
  • Stress

Now that we’ve got the heavy stuff out of the way, how does that translate into interview questions?

Belonging / Rejection and Isolation

A sense of belonging contributes to happiness. A sense of happiness contributes to productivity. Thus, you will be more successful if you feel like you belong. Even when your job is really bad, if you feel like “you’re all in it together”, it’s easier to pull through.

Questions:

  • What is the diversity of your team?
  • Are there people like me on the team?
  • Who will be my mentor when I join?
  • What are some social activities we will do as a team?
  • What communities for technical and non-technical topics exist at the company?
  • Do you feel like you could be friends with some of the people you work with if they weren’t your coworkers?
  • How often will I have 1 on 1s with my manager?

Scenario one: a diverse group of people who welcome new members with an automatic support network of a mentor and bond through shared interests. Scenario two: a monochromatic team of humorless people you can’t identify with that leave you to struggle alone and generally don’t talk to each other. Take your pick.

Achievement, Growth, and Variety / Boredom

Boredom is bad. Boredom is similar to stress. You disassociate and (eventually) become depressed or destructive. Work that slightly exceeds your skill set is ideal for maximum engagement and learning (according to Emotional Intelligence by Daniel Goleman). You can keep things interesting through promotions, new skills, or role changes. Additionally, if you want to climb the ladder, make sure there are at least a few rungs.

Questions:

  • What does promotion look like here?
  • How long does someone like me stay in this role before being considered for promotion?
  • Do you support 20% time or time to grow professionally via hackathons, conferences, or tech talks?
  • Does this company encourage moving between teams if there are other opportunities available?
  • Does this company support role changes and what does a successful role change look like?

Trust / Exploitation

People typically know when something they are going to say will put people off. The managers and recruiters of the world know this and choose to omit or mislead when it comes to that information. Instead of trying to catch them in a lie, probe to fill out the truthiness of their answers. This was taught to me as “peeling the onion”. In this metaphor, the more you peel the onion, you might get more onion or, I don’t know, a radish.

Questions:

  • What tasks are you working on right now? Ask for specifics.
  • What would you say are the key success criteria for your job? Why?
  • What is an example of the first project (not task) I will be working on? How is this important to our customer?
  • How involved will I be in designing new features and choosing team priorities? How often will I get a chance to influence project direction?
  • How many people with my role are on the team? (the more there are, the more reliable their answers)
  • If I am interested in working on something in particular, how would I go about getting assigned to the project? Give me an example of when you did this.
  • When something goes wrong, what is the recovery? Maybe a bug is pushed or a customer says the feature was done wrongly or a service goes down. Is there a retrospective? Does it get fixed right away?
  • Who is responsible for operations, customer contact, and project planning?

This is probably the hardest one to detect. Often, teams want to hire someone to do the housekeeping, like bug fixes, legacy maintenance, mindless migration, and minor management activities. You need to ask questions to confirm there is enough “meaty” work for you and housekeeping is spread evenly or kept to a minimum.

General Red Flags

  • Your manager has been in the company or sub-org for less than 6 months. This usually means they haven’t been through a performance cycle and there is a risk that they aren’t sure what it looks like for you to do a good job. If you don’t know how to do a good job, you might not be rewarded for the work you do. However, after about a year to a year and a half, most managers figure it out.
  • You are being hired for a “generic” position. This is basically job roulette. It’s worked out well for me in the past but it’s also an opportunity for you to be placed where no one else wants to be.
  • There are a lot of buzzwords. If something sounds good but doesn’t tell you anything specific, they might be trying to hide something. “We do machine learning” is the equivalent of saying “we develop software”. It generates excitement but doesn’t tell you that you’ll actually be a code monkey for the scientists who do the “real” machine learning.
  • “We have no operations.” This is very job dependent. I’m talking about services, cloud, and larger software applications. If you have no ops, you have no usage or no customers. On the other side, you might have a lot of ops but someone else deals with it. This is an organizational anti-pattern and guarantees someone will strongly dislike you on that ops team. Not fun.
  • “We are a rapidly growing team.” This can genuinely be exciting if you are joining a team of smart and capable people coming together to create a new great thing. Or this could mean the managers are throwing bodies at a problem in such a way that creates stress, confusion, and general unhappiness.

It’s Too Late

If you’ve found yourself in a job that didn’t live up to your expectations, first, figure out which questions to ask next time you interview. Second, tell someone it wasn’t what you expected and firmly ask to be placed somewhere that meets those expectations. Third, as soon as you can, decide whether you want to stay or go. Be intentional about what job you are choosing to do. By taking responsibility, you give yourself control over your situation and who doesn’t like control?

Good luck interviewing!