System design tips: how to estimate capacity
Estimation is an important skill in system design interviews because it
lets you demonstrate that you can reason about the complexity
and feasibility of a given problem. Estimation is inexact, and
it is normal for estimates to be off by some margin. Showing that you
clearly understand the problem and can produce a thoughtful,
well-reasoned estimate matters more than producing a
perfectly accurate one.
Here are a few examples of the types of estimation questions you might encounter in a system design interview:
- How many users can a system support? For example, you might be asked to estimate the number of users that a social media platform can support, based on the number of servers, storage, and bandwidth available.
- How much data can a system store or process? For example, you might be asked to estimate the amount of data that a data warehouse can store, based on the size of the data, the number of servers, and the storage capacity of the servers.
- How scalable is a system? For example, you might be asked to estimate the performance and scalability of a web application, based on the number of users, the amount of data, and the server resources available.
Cheat sheet:
Byte conversions:
- 1 B = 8 bits
- 1 KB = 1000 B
- 1 MB = 1000 KB
- 1 GB = 1000 MB
- 1 TB = 1000 GB
- 1 PB = 1000 TB
Storage scale numbers:
- 1 char = 1 byte
- Metadata (title, description, etc., excluding images) ~ 1-10 KB
- Image ~ 1-2 MB
- HD video (1 minute) ~ 50 MB
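The per-item sizes above can be turned into a storage estimate by multiplying by the expected volume. Below is a minimal Python sketch of that arithmetic. The workload (10 M image posts per day) and the function name `daily_storage_bytes` are illustrative assumptions, and the result ignores replication, compression, and indexes.

```python
# Decimal units, matching the byte conversions in the cheat sheet.
KB = 1000
MB = 1000 * KB
GB = 1000 * MB
TB = 1000 * GB
PB = 1000 * TB

# Per-item sizes from the list above (upper end of each range).
METADATA_BYTES = 10 * KB
IMAGE_BYTES = 2 * MB


def daily_storage_bytes(items_per_day: int, bytes_per_item: int) -> int:
    """Raw storage added per day, ignoring replication and compression."""
    return items_per_day * bytes_per_item


# Assumed workload for illustration: 10 M image posts per day.
posts_per_day = 10_000_000
per_day = daily_storage_bytes(posts_per_day, IMAGE_BYTES + METADATA_BYTES)
per_year = per_day * 365

print(f"per day:  {per_day / TB:.1f} TB")
print(f"per year: {per_year / PB:.1f} PB")
```

With these assumptions the service adds roughly 20 TB of raw data per day, which is already a large fraction of the 50 TB single-database figure in the cheat sheet.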
Operations numbers:
- HDD sequential read - 30 MB/s
- SSD sequential read - 1 GB/s
- Memory sequential read - 4 GB/s
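These throughput figures are easiest to use as "how long does it take to read X". A small Python sketch of that conversion follows. The dictionary and function names are made up for illustration, and the throughput values are simply the approximate numbers listed above.

```python
MB = 1000 ** 2
GB = 1000 ** 3

# Approximate sequential read throughput in bytes per second (from the list above).
READ_BYTES_PER_SEC = {
    "hdd": 30 * MB,
    "ssd": 1 * GB,
    "memory": 4 * GB,
}


def seconds_to_read(size_bytes: int, medium: str) -> float:
    """Time to read size_bytes sequentially from the given medium."""
    return size_bytes / READ_BYTES_PER_SEC[medium]


for medium in READ_BYTES_PER_SEC:
    print(f"{medium}: {seconds_to_read(1 * GB, medium):.2f} s to read 1 GB")
```

Reading 1 GB takes about 33 seconds from an HDD, 1 second from an SSD, and a quarter of a second from memory under these numbers.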
SQL databases (numbers are approximate; the purpose is to give a general idea of performance):
- Connections: 20 K
- Storage: 50 TB
- Requests: 20 K/s
Cache (numbers are approximate; the purpose is to give a general idea of performance):
- Connections: 10 K
- Requests: 100 K/s
- Storage: 300 GB
Web servers (numbers are approximate; the purpose is to give a general idea of performance):
- Requests: 5 K/s
Queues (numbers are approximate; the purpose is to give a general idea of performance):
- Requests: 3 K/s
- Throughput (writes): 1-50 MB/s
- Throughput (reads): 2-100 MB/s
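A common use of the per-component request rates above is estimating how many nodes a target load needs. The sketch below is one way to do that in Python. The `headroom` factor of 0.7 (running each node at 70% of its nominal capacity) is an assumption added here, not a figure from the cheat sheet, and the names are illustrative.

```python
import math

# Approximate per-node capacity in requests per second (from the cheat sheet).
CAPACITY_RPS = {
    "sql_database": 20_000,
    "cache": 100_000,
    "web_server": 5_000,
    "queue": 3_000,
}


def nodes_needed(target_rps: float, component: str, headroom: float = 0.7) -> int:
    """Nodes required if each node is loaded to only `headroom` of its capacity."""
    usable_rps = CAPACITY_RPS[component] * headroom
    return math.ceil(target_rps / usable_rps)


# Assumed target load for illustration: 50 K requests per second.
for component in CAPACITY_RPS:
    print(f"{component}: {nodes_needed(50_000, component)} node(s)")
```

The counts assume load spreads evenly across nodes, which is optimistic for databases and queues where sharding or partitioning is rarely perfectly balanced.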
Calculation example:
10 M photos are uploaded daily to a service.
- 10 (number of millions) * 12 (approximate events per second for 1 M events per day, since 1,000,000 / 86,400 seconds ~ 11.6) = 120 uploads/second.
- 120 (uploads per second) * 1 MB (size of one photo) = 120 MB per second.
The web servers will need to handle a network bandwidth of 120 MB per second (about 960 Mbit/s).
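The same calculation can be written out in Python to show where the factor of 12 comes from. This is a minimal sketch, and the helper name `per_second` is illustrative. It uses the exact number of seconds in a day, so the result comes out slightly below the rounded 120.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
MB = 1000 ** 2


def per_second(events_per_day: float) -> float:
    """Average event rate, assuming traffic is spread evenly over the day."""
    return events_per_day / SECONDS_PER_DAY


photo_bytes = 1 * MB                       # photo size from the example
uploads_per_sec = per_second(10_000_000)   # 10 M uploads per day
bandwidth_bytes_per_sec = uploads_per_sec * photo_bytes

print(f"1 M per day is about {per_second(1_000_000):.1f} per second")
print(f"uploads:   {uploads_per_sec:.0f} per second")
print(f"bandwidth: {bandwidth_bytes_per_sec / MB:.0f} MB/s")
```

Real traffic is rarely uniform, so a peak-to-average multiplier is usually applied on top of this average. The size of that multiplier depends on the service and is not covered here.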
It's also important to consider the bottlenecks in your system. Just because
your bandwidth can handle a certain number of requests per second
doesn't mean that your entire system can. Some parts of the system that
might struggle to keep up with a high workload include the database
connections or throughput, hard disk reading/writing, the load balancer, and API calls to a
third-party service that has rate limits. Every component of the system
needs to be able to handle the expected demand.
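One way to make the bottleneck check concrete is to list the capacity of each component on the request path and take the minimum. The Python sketch below does that. The component names and capacities are hypothetical, and it assumes every request passes through every listed component exactly once.

```python
from typing import Dict, Tuple


def bottleneck(capacities_rps: Dict[str, float]) -> Tuple[str, float]:
    """Return the component with the lowest capacity and that capacity."""
    name = min(capacities_rps, key=lambda component: capacities_rps[component])
    return name, capacities_rps[name]


# Hypothetical deployment: capacities in requests per second.
pipeline = {
    "load_balancer": 50_000,
    "web_servers": 4 * 5_000,     # four servers at about 5 K requests/s each
    "sql_database": 20_000,
    "third_party_api": 1_000,     # assumed rate limit of an external service
}

name, limit = bottleneck(pipeline)
print(f"The system is limited to about {limit} requests/s by {name}")
```

Here the rate-limited third-party API caps the whole system at 1 K requests per second, even though every other component could handle 20 times more.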