System Design Interviews

As I network through tech I frequently hear newcomers ask how to answer system design questions. I’ll try to clear that up in this post.

Why am I being asked to design a parking lot?

If you’re comfortable writing code and work on multi-component projects, then you might be familiar with maximizing reuse without high coupling. In an interview, they will test this with object-oriented design questions. In a service-oriented world, we are now being asked to do this with systems.

These concepts are inconsistently taught across all programs. Unlike a for loop, interviewers can’t always expect a candidate to be familiar with systems and services. Unfortunately, the interviewers are given a checklist and if it contains system design, they ask it. More bad news: the recruiters aren’t told what’s on the list. As a result, you and your interviewer have different ideas about what’s going to be asked.

How To Learn This

Question and Answer Format

Question

Questions will be phrased as: “design a system to XYZ”, “design a pre-existing software system”, “draw the architecture for XYZ”. Examples:

  • Existing software: Uber, Lyft, Facebook Messenger, Amazon retail website, Google Search, DNS, Expedia, Docker, GitHub, WordPress, any website you can think of.
  • Interview problems: Parking garage, generic search engine, inventory management system, hotel guest management system, log aggregation

Anything can come up here depending on how prepared or creative the interviewer is.

Boxes and Lines

The answer is going to be boxes and lines.

[Image: boxes and lines]

Approaching the Question

Clarifying Questions and Requirements

If you’ve done any preparation or any interviews, you will know you need to ask clarifying questions and get the requirements. Treat your unimplemented solution as a black box and try to describe the inputs and the outputs. Take Twitter for example:

  • Are tweets text only?
  • How will customers get data out of the system? Browser, phone, REST API?
  • Can a customer send one tweet at a time or many?
  • Do users have accounts or is this anonymous?
  • Is user following enabled?
  • Are there multiple feeds or one giant aggregated feed of tweets?

You might be thinking “Do you even know how Twitter works?” If you are, good. You need to make sure you and the interviewer are on the same page. You can’t come up with a design that has every current Twitter feature, so make sure to scope it down to a few specific features. Saying “I’m going to implement user accounts, tweeting text only, and retrieving messages without the follower system as my first round” is extremely helpful. The interviewer will know that you are aware that there are more features and are choosing to defer them rather than forgetting them. This also makes the problem easier. If you don’t specify which aspects you are implementing, you and your interviewer will have different ideas about a good answer.

Note: These interviews are 45 to 60 min. It’s better to start small and then discuss enlarging the scope with the interviewer than the other way around.

Always identify the data.

The data tells you a lot about the design. What data is being sent? What data is being stored? Does the data need to be sorted? How quickly does the data need to be available? This provides another way to get requirements. Example:

  • How will data be searched? Maybe we start with hashtag search only.
  • Can anyone see all tweets or are some tweets restricted viewing? We can start with all public.
  • When we search for tweets, do we return all of them that match or only the top 100 most recent? Let’s do all of them at first for simplicity.

The Boxes

What are the boxes? Don’t be thrifty on the boxes. Boxes are more easily grouped than separated. You can do a “discovery” style and follow a single piece of data through the system.

For example, a customer hits the “Tweet” button from their browser. This goes to some server, the TweetManager. The tweet needs to be stored and show up in the user’s page and the global feed (as per requirements of no custom feeds). The tweet also needs to be searched later by hashtags (as per requirement of hashtag search only). This tells us we need a place to store the whole tweet, TweetStore. We might also need to store user information that can reference their tweets, UserStore and maybe UserManager. Somewhere else, we need to support hashtag searches, HashtagManager and/or HashtagStore. Finally, we can throw in notifications for fun and have a NotificationManager. Here’s what that looks like:

[Diagram: Twitlines]

  1. The user tweets and the tweet is sent to the TweetManager (here we assume that the user is logged in and is who their cookies or headers say they are).
  2. The TweetManager stores the Tweet in the TweetStore. At this point it is linked to the user that posted it.
  3. The TweetManager sends the hashtags in the tweet, along with a reference to the tweet in the TweetStore, to the HashtagManager.
  4. The HashtagManager adds the tweet reference to the list of all tweets related to that hashtag in the hashtag store (this creates a link to the tweet store).
  5. The TweetManager sends a completed message to the notifier which then shows everyone logged in that a new tweet has been posted.

We also have Login and ‘A’ which is searching by hashtag.

Note: This is a simplified example. I would strongly discourage using this answer in an interview because it doesn’t scale, has high latency, and will likely result in data inconsistency.
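To make the five steps concrete, here is a minimal in-memory sketch of the boxes talking to each other. Everything about it is an assumption for illustration: the method names (save, index, search, notify, post), the dictionary-backed stores, and the regex used to pull out hashtags are mine, and it has the same scaling and consistency problems as the diagram.

```python
import re
from collections import defaultdict


class TweetStore:
    """Holds whole tweets, keyed by a generated id."""

    def __init__(self):
        self._tweets = {}
        self._next_id = 1

    def save(self, user, text):
        tweet_id = self._next_id
        self._next_id += 1
        # The stored tweet is linked to the user who posted it (step 2).
        self._tweets[tweet_id] = {"user": user, "text": text}
        return tweet_id

    def get(self, tweet_id):
        return self._tweets[tweet_id]


class HashtagManager:
    """Keeps the hashtag store: each hashtag maps to a list of tweet ids."""

    def __init__(self, tweet_store):
        self._tweet_store = tweet_store
        self._by_tag = defaultdict(list)

    def index(self, tags, tweet_id):
        for tag in tags:
            self._by_tag[tag].append(tweet_id)  # step 4

    def search(self, tag):
        # Returns every matching tweet, per the "all of them at first" requirement.
        ids = self._by_tag.get(tag.lower(), [])
        return [self._tweet_store.get(i) for i in ids]


class NotificationManager:
    """Stand-in notifier that just remembers what it was asked to send."""

    def __init__(self):
        self.sent = []

    def notify(self, message):
        self.sent.append(message)


class TweetManager:
    def __init__(self, tweet_store, hashtag_manager, notifier):
        self._tweets = tweet_store
        self._hashtags = hashtag_manager
        self._notifier = notifier

    def post(self, user, text):
        tweet_id = self._tweets.save(user, text)  # step 2
        tags = {t.lower() for t in re.findall(r"#(\w+)", text)}
        self._hashtags.index(tags, tweet_id)  # steps 3 and 4
        self._notifier.notify(f"{user} posted tweet {tweet_id}")  # step 5
        return tweet_id


store = TweetStore()
hashtags = HashtagManager(store)
notifier = NotificationManager()
manager = TweetManager(store, hashtags, notifier)

manager.post("alice", "Learning #SystemDesign today")  # step 1
print(hashtags.search("systemdesign"))
print(notifier.sent)
```

Notice that the TweetManager makes three blocking calls per tweet. That is exactly the kind of thing the grouping and scalability discussions are meant to question.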

Hey, Where Are My Tables?

You may be tempted to say one of your boxes will be a SQL database and start describing your schema in detail. Don’t. Don’t do this unless the interviewer asks you to go into details. If you focus on details, you won’t have time to answer the whole question. However, having them at the ready doesn’t hurt in follow up questions.

Grouping

It turns out I have 3 storage boxes and 3 service boxes. Do they all need to be separate or can some be together? There are many driving factors for doing this:

  • Minimize coupling: you should minimize lines between two boxes for a single “trip” through the system
  • Follow the K.I.S.S. rule: you should not have to fan out to turn one tweet into 5 calls to other boxes
  • Asynchronous vs. Synchronous: know which lines need to be blocking and which can be done asynchronously. In some systems, the different types of calls go to different boxes (not always).
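As a rough illustration of the blocking versus asynchronous split, the sketch below treats storing the tweet as the blocking call and hands the notification to a background worker through a queue. The function names, the in-process queue, and the list standing in for a data store are all assumptions of mine; in a real design the queue would typically be a separate box.

```python
import queue
import threading

notifications = queue.Queue()
delivered = []


def notification_worker():
    """Drains the queue in the background; None is the shutdown signal."""
    while True:
        message = notifications.get()
        if message is None:
            break
        delivered.append(message)  # stand-in for pushing to logged-in users


def post_tweet(store, user, text):
    store.append((user, text))  # synchronous: the caller waits for the write
    notifications.put(f"{user} tweeted")  # asynchronous: enqueue and move on
    return len(store) - 1


worker = threading.Thread(target=notification_worker)
worker.start()

tweets = []
post_tweet(tweets, "alice", "hello")
post_tweet(tweets, "bob", "hi")

notifications.put(None)  # ask the worker to stop once it has drained the queue
worker.join()
print(delivered)
```

The point to make in the interview is that the customer only waits on the write, while the notification line can lag without failing the request.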

A simple grouping might be all the logic in one box and all the data in another:

[Diagram: Group]

Scalability

Scalability is challenging. In these questions, you can either predict how to scale the system or you can do this exercise repeatedly:

  1. A significant political event is happening. A few key individuals are sending out tweets through the event. Everyone on the planet is trying to search for tweets with the event hashtag at the same time.
  2. Identify what will break first and how it will break. If it breaks, does it break only the overloaded part of the system or other parts too? In this case, the hashtag manager can actually be scaled to tolerate the number of requests. However, the data store (no matter what store you use) will be in trouble.
  3. Discuss or draw a way to solve this. The most common answers to this are asynchronous communication with queues, replicas, and some form of sharding.

Often you need to do one or two examples to show you understand what scaling is and how to alter a system to support more traffic.
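If sharding comes up, it helps to be able to show what “some form of sharding” means. Below is a minimal sketch of hash-based sharding for the hashtag store, assuming a fixed number of shards and plain dictionaries in place of real databases. The class and method names are hypothetical.

```python
import hashlib


class ShardedHashtagStore:
    """Spreads hashtags across a fixed number of in-memory shards."""

    def __init__(self, shard_count):
        self._shards = [dict() for _ in range(shard_count)]

    def _shard_for(self, tag):
        # Use a stable hash so the same tag always maps to the same shard.
        # Python's built-in hash() is randomized per process for strings.
        digest = hashlib.sha256(tag.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._shards)
        return self._shards[index]

    def add(self, tag, tweet_id):
        self._shard_for(tag).setdefault(tag, []).append(tweet_id)

    def lookup(self, tag):
        return list(self._shard_for(tag).get(tag, []))

    def shard_sizes(self):
        """Number of distinct hashtags held by each shard."""
        return [len(shard) for shard in self._shards]


store = ShardedHashtagStore(shard_count=4)
for tweet_id, tag in enumerate(["election", "election", "cats", "docker"], start=1):
    store.add(tag, tweet_id)

print(store.lookup("election"))
print(store.shard_sizes())
```

Sharding by hashtag spreads different hashtags across machines, but a single hot hashtag, like the one in the political event scenario, still lands on one shard. That is where replicas come back into the conversation.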

New Features

You’re feeling great because you drew boxes and lines. You came up with your own scaling answers before the interviewer asked. Then the interviewer throws a giant wrench into your answer: I want you to implement followers, identity verification, and malicious account detection.

If you only have 5 minutes left, feel free to verbally explain while waving your hands at your diagram. Throw in a quick machine-learning-trained fraud box and a service that requests identity verification through a third party vendor. Usually when these features are requested near the end of the interview, it isn’t about implementing them but about discussing how you can grow your solution. If you can’t discuss the solution, it might look like you copied it from a book.

Follow Up Questions

  • Why is this data here and who can access it?
  • How long does it take to complete one customer request?
  • Why are these two components separate or together?
  • Can you make this more generic or into a platform? How would you convert this into a Software as a Service product?

I Did All The Right Things And All I Got Was This T-Shirt

I’ve had too many interviews where the interviewer was stubborn, critical, condescending, dismissive, and basically an asshole. It’s almost impossible to succeed in these cases unless you fit their mold perfectly and even then it’s not a guarantee. Here are some signs that it might not be your answer that’s at fault:

  • Your interviewer is flipping between high level criteria and specific technologies (e.g. we need a customer service but please use AWS Lambda)
  • Your interviewer is telling you where to draw your lines without explaining why
  • Your interviewer is repeatedly cutting you off when you try to ask clarifying questions and replies with “just implement it the way it is”
  • Your interviewer says “uhhh” a lot and doesn’t seem to know where he or she is going with the question
  • Your interviewer changes the question halfway through: did I say architecture? I meant library
  • You were incorrectly leveled by your recruiter (surprise!) and you feel deeply in over your head.

Leveling?

All companies put their applicants into buckets that are highly variable but sort of follow this structure:

  • 0 to 2 years of experience: entry level
  • 2 to 5 years of experience: mid level
  • 5+ years of experience: senior level
  • 12+ years of experience: principal or CTO level

These buckets vary by company, geographic area, job role, and how the recruiter feels that day.

I think it’s unreasonable to ask an entry level applicant a design question because they may or may not have encountered one before. If an applicant is on the border between experience levels, the interviewer may ask this question to see which bucket they fit in. Finally, as above, recruiters and interviewers don’t talk.

In Conclusion

Just don’t draw a single box with a SQL database schema as your answer. Good luck!

 

Course Review: Docker for Java Developers

This post reviews a course on Lynda.com (a company owned by LinkedIn and, by extension, Microsoft) about Docker for Java Developers.

Lynda.com Course

Docker

Java 

Usefulness And Overview

Currently, the course topic is relevant. The paradigm of “containerization” or releasing your software as self-contained collections of related packages and dependencies called “containers” is catching on quickly across services in the industry. Even though this says it’s for Java developers, it’s not really Java specific. All the concepts and commands used are language independent to a certain extent. The part that the course missed out on was Kubernetes, a fast growing solution from Google related to container management.

Is this particular course a good use of your time to learn about Docker? Maybe. A lot of the content was easily found in documentation or by searching online. If you like information presented in sequence with context, yes, this is a good choice. Otherwise, it may be tedious or too shallow in topic coverage.

The course follows a mini-lecture with demo format. You can copy the course materials and follow along with the demo. The course starts off assuming you don’t have Docker set up. The content begins with installation and follows a simple web app through containerization, deployment, release and scaling. It further goes through monitoring options and maintenance commands.

Course Details

  • The instructor introduces Docker by showing you the download websites and how to install on various operating systems.
  • He introduces the course material by showing how to use git to clone it so you can follow along.
  • The first use of Docker is to create a container with the sample application and use the start and stop commands along with their options. He also shows how to list running containers.
  • Next, the website is deployed using the container and various health checks are shown. An important note here was how container health is different from application health.
  • The lecture shows how to automate the use of containers in a build and release flow.
  • Container sharing, tagging, and maintenance in a container store are shown along with best practices for tagging.
  • Next was a more complex application with multiple services whose containers needed to be started up in a particular order (application and database).
  • He went over the use of container contexts to allow running multiple instances of a container on the same host.
  • This then moved into more advanced use of containers including swarm mode with rolling updates, certificate rotation, auto-scaling, and fail over.
  • He went over container maintenance and use of the master node to manage other nodes in the cluster including the use of drain and pause commands.
  • Another advanced topic covered was storage nodes and how to use container independent storage or distributed storage solutions with containers.
  • As the last topic, he went over tools and other plugins for monitoring including the stats CLI tool, Prometheus, and cAdvisor.
  • He did not go over Kubernetes but recommended it as a future topic.

Lightning Talk: Mindfulness To Find Your Dream Job

I did a 5 minute lightning talk at a women in tech conference. Here’s the blog version.

Measuring Your Heart

Everything that irritates us about others can lead us to an understanding of ourselves.

Carl Jung

As Carl Jung points out, we can learn a lot about our likes and dislikes by paying attention to the things that irritate us. That is more or less how this works.

3 Simple Steps

Step 1: Collect

Before we can answer any questions about what we like or dislike at work, we need to collect data. According to The Paradox Of Choice, a book that explores our biases when remembering experiences and making choices, we judge whether we like an experience based on our feelings at the end. If you had a mostly bad day at work but the last hour or two were great, you might think you had a great day. For that reason, I recommend using mindfulness to collect data based on small tasks or events in your day rather than trying to decide whether you like your work at the end of the day, week, or month.

How does this work?

Trigger

Building habits is hard. According to The Power of Habit, the best way to build a habit is to associate it with an existing trigger. For example, your trigger might be checking your phone or going to the bathroom. Every time you do this, take a second to use the mindfulness technique to record data about your feelings about your job.

Mindfulness

If you’re not familiar with mindfulness, don’t worry: this is a really tiny aspect of it, used as a focus tool. First, you need to remove distractions. I use physical sensation to draw attention to right now. Hold fabric between your fingers and rub them together to really pay attention to the texture of the cloth. You can draw one finger along the inside of another finger to generate a sensation that grabs your attention. Bringing focus to a physical sensation is all you need to temporarily dislodge yourself from the barrage of thoughts about everything else but now.

Once you have your attention, do a “body scan”. This is reading your own body language. Are your shoulders tensed or relaxed? Are you breathing slowly and deeply or quickly and shallowly? Are you fidgeting or balling your hands into fists? A lot of these little things are easily noticed if you take a second to pay attention, and they tell you how you’re feeling.

Record

Each “record” should be a pair: what were you doing and how did you feel after. These can be as detailed or sparse as you want. As you repeat the exercise, you will be able to adjust according to what data is most useful to you.

Examples:

  • One on one with manager: happy, relaxed, confident
  • Meeting with stakeholder: tense, crossed arms, needed to take a walk
  • Publishing code review: godlike
  • Release war room: why do I do this job?

Step 2: Categorize

Next we categorize the data. There can be 2 or more categories and they can be whatever you want. My favorite is “good vs. bad” but other useful ones are “stressful vs. calming”, “energizing vs. draining”, or “empowering vs. demotivating”. Depending on what you want to change or understand, you can adjust your categories. This technique can be used to sort your activities into groups like “helps promotion vs. busy work” or “builds skills vs. menial tasks”. These can be used to stay on track for career goals.

Example:

Good

  • Publishing code review
  • Figuring out root cause of bug
  • Successful release to production
  • One on one with manager

Bad

  • Team retrospective
  • Meetings with stakeholders
  • Writing integration test for legacy features
  • Release war room
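If you prefer a script to pen and paper, the collect and categorize steps fit in a few lines. This is only a sketch: the records are the examples from Step 1, and the feeling-to-category mapping is a hypothetical one written by hand, since deciding what counts as “good” or “bad” is your call.

```python
from collections import defaultdict

# Each record is a pair: (what you were doing, how you felt after).
records = [
    ("One on one with manager", "happy, relaxed, confident"),
    ("Meeting with stakeholder", "tense, crossed arms, needed to take a walk"),
    ("Publishing code review", "godlike"),
    ("Release war room", "why do I do this job?"),
]

# Hypothetical hand-written mapping from a recorded feeling to a category.
category_of = {
    "happy, relaxed, confident": "Good",
    "godlike": "Good",
    "tense, crossed arms, needed to take a walk": "Bad",
    "why do I do this job?": "Bad",
}


def categorize(records, category_of):
    """Group activities by the category assigned to the recorded feeling."""
    groups = defaultdict(list)
    for activity, feeling in records:
        groups[category_of.get(feeling, "Uncategorized")].append(activity)
    return dict(groups)


for category, activities in categorize(records, category_of).items():
    print(category, activities)
```

Swapping in a different mapping, such as “energizing vs. draining”, regroups the same records without collecting anything new.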

Step 3: Interpret

Finally, figuring out your dream. I can’t promise this will get you the best job in your next career change but if you do this regularly, it will make you more aware of what to change now and look for in the future. How does that work?

From the example above I can see a few trends:

  • I tend not to like meetings
  • I have a good relationship with my manager
  • I enjoy releasing code and moving code along in the development process
  • I enjoy solving problems
  • I don’t like being in high stress situations like war rooms or situations that may be otherwise delicate like retrospectives
  • I tend to prefer solo tasks
  • It looks like I prefer smaller meeting sizes
  • I might have a good relationship with my manager but not my team based on the retrospective being in the “bad” column
  • I might not like partner teams if the war room and stakeholder meeting both fell under bad
  • I probably like our development infrastructure since I liked publishing my code review and releasing my code

For the more speculative ones, the ones with “might” and “probably”, you might need better data around those events.

Now, I have this blurb to put on my LinkedIn profile:

I am looking for a position that values independent workers who work closely with their core teams. I enjoy working for managers who empower their engineers to stay focused on their project work. I prefer written communication to meetings and I’m strongly in favor of remote work. I am passionate about devops and development process excellence. I gain great satisfaction from a job where I can problem solve when digging into the root cause of issues.

It sounds like my likes and dislikes at work make me a perfect devops, quality, or infrastructure engineer on a remote team that values independent workers. When I first did this exercise and saw this data, I was on a team that prioritized frequent collaboration across multiple teams and mandated feature development over process or product improvement. This might explain why I wasn’t so happy there.

This also leads to key terms for a job search:

Independent, single manager or fewer managers, written communication, devops, operational excellence, remote, debugging, quality

Here’s a random job posting for a Remote Security Engineer at Elasticsearch. Let’s see how many of those traits I can find (I’ve marked the relevant parts in brackets):

Engineering Philosophy

Engineering a distributed system that is easy to operate via elegantly designed APIs is a challenge. It requires software development skills and the ability to think like a user. We care deeply about giving you ownership of what you’re working on [Independence]. Our company believes we achieve greatness when they are set free and are surrounded and challenged by their peers. At Elastic, we effectively don’t have a hierarchy to speak of [Less multi-manager meetings]; we feel that you should be empowered to comment on anything, regardless of your role within the company.

What You Will Be Doing:

  • Evolving the security features of Elasticsearch.
  • Implement authentication, authorization, and other security protocols within Elasticsearch.
  • Build the foundation of security for the Elastic Stack using knowledge of cryptographic primitives and security trade-offs.
  • Prototype new ideas and experiment openly.
  • Collaborating in the open with the Elasticsearch team, Elastic Stack users, and others supporting open source projects.
  • Working with the community on bugs and performance issues and assisting out support engineers with tougher customer issues. [Debugging]

Tally this up: remote, independent, few multi-manager meetings, quality (comes with security), and debugging with customers. This basically meets everything but the devops requirement. Before I did this exercise, I wouldn’t have looked for or considered this job. It looks like a much better fit for my likes and dislikes than my job at the time was.

Finally, you don’t actually need to leave your current job to “find” your dream job. If you bring this data to your manager, you can have a conversation to improve your current day to day work.

Examples:

  • Hi Manager, I really enjoy improving and augmenting our development infrastructure. Is there any bandwidth for me to spend more time on tasks like this?
  • Dear Manager, I find the stakeholder and war room meetings with Team X are very chaotic and distracting. Do you think you could help me push for a conference call so I don’t need to be in the room and can be less distracted?
  • To the Manager whom it may concern, I understand that you’ve been placing me in leadership positions for several new products. While I think this is a great compliment for the trust you have in me, I want to work with you to make time for doing what I love at this job: crushing bugs and solving problems.
  • Meetings suck. Please make them stop.

How you phrase these has more to do with Crucial Conversations than anything else. At the very least, you communicate what you want more of or less of.

Brush Twice, Floss Once

How often should you do this? I recommend 5 to 10 consecutive business days with a handful of measurements per day to get a good sense of your average work day. Be careful of the time frame you choose. If another significant life event is going on or something else is changing, you may be measuring your reaction to that other thing instead of your reaction to your job.

Tools

Tools for setting triggers:

  • Phone alarms
  • Calendar reminders
  • Apps like Daylio

Tools for measuring:

  • Coloring or tagging your calendar meetings with categories describing your reactions to them
  • Apps like Daylio
  • Pen and Paper

Tools for interpreting:

  • Pen and paper
  • Apps, once again, like Daylio

Happy Self Quantifying.