How to design tiny url system
This post is a personal cheatsheet on how to design tiny url system.
Questions to ask:
What’s the length of short url?
How many urls will come?
What duration to support?
What chars in url are allowed?
Basic calculation:
Let’s assume that we are allowed to use chars a-z, A-Z, 0-9 (62 chars in total). Url with length of 1 char means we are able to generate 62 unique short urls, 2 chars - \(62^2\) urls, 3 chars - \(62^3\), etc…
Let’s assume that we get x requests per second and the business requirement is to store the short versions of urls for 10 years. We get following equation:
\(x\) requests per second * 60 sec * 60 mins * 24 hours * 365 days * 10 years = \(y\)
We get following formula from equation:
\(62^n > y => n = log_{62}(y)\), where n is the length of short url.
n = 7 looks like a decent amount of unique urls (\(62^7\) = 3.52 trillions)
System capacity:
Non-functional requirements:
Very low latency
Very high availability
Functional requirements:
Get short url
Redirect to long url
Basic design:
Basic design |
- User interface - we take long url as an input and try to convert it into tiny url.
- LB - load balancer.
- Short url service - service that creates short unique urls.
- Cassandra - database where we store urls.
How “short url” service should work under the hood:
General idea: generate random unique number and convert it from base 10 to base 62.
But how to generate unique numbers and handle collisions?
Iteration 1: generate random number and retry generation if collision happens - not efficient :(
Iteration 2: generate unique number using redis cluster incrementor.
Design (iteration 2) |
Issues with solution: we use only one redis, hence system is not highly available. In case if to use multiple redises - we will start generating duplicates.
Iteration 3 (final): create token service that orchestrates unique number ranges, then Short url services use requested ranges and increment value.
Design (iteration 3) |
Issues with solution and how to handle them:
Incremented value is out of given range - then just request next range.
We get new bottleneck - we can increase the given range in order to decrease the service load, also we can have multiple instances of token service and use leader election, we can locate token services across different regions depending on our load.
What if short url service shuts down (incrementing is in memory) - just request next range, it’s okay if we loose them, just thousand out of 3.5 T tokens.
Lessons learned:
Never create a single point of failure.
Resources:
TinyURL System Design | Bitly System Design
System Design : Scalable URL shortener service like TinyURL
Design URL Shortening service like TinyURL
Comments
Post a Comment