How to design tiny url system

This post is a personal cheatsheet on how to design tiny url system.


Questions to ask:


  1. What’s the length of short url?

  2. How many urls will come?

  3. What duration to support?

  4. What chars in url are allowed?


Basic calculation:


Let’s assume that we are allowed to use chars  a-z, A-Z, 0-9 (62 chars in total). Url with length of 1 char means we are able to generate 62 unique short urls, 2 chars - \(62^2\) urls, 3 chars - \(62^3\), etc…


Let’s assume that we get x requests per second and the business requirement is to store the short versions of urls for 10 years. We get following equation:


\(x\) requests per second * 60 sec * 60 mins * 24 hours * 365 days * 10 years = \(y\)


We get following formula from equation:


\(62^n > y => n = log_{62}(y)\), where n is the length of short url.


n = 7 looks like a decent amount of unique urls (\(62^7\) = 3.52 trillions)


System capacity:

Let’s assume that we have have read/write ratio 200:1, number of generated links per month - 100 million.

Write load: 100 million / (30 days * 24 hours * 60 mins * 60 sec) ~ 40 requests per second.

Read load: 40 * 200 = 8000 requests per second.

Storage:  assuming that service lifetime is 10 years: 100 million * 12 month = 12 billion. If object size is 500 bytes (short url, long url, create_at etc), then required storage is 12 billion * 500 bytes = 6 TB.

Memory:  following Pareto principle (80% requests are for 20% of data) since we get 8000 requests per second, we will be getting 700 million requests per day (8000 requests * 24 hours * 60 mins * 60 sec). To cache 20% of requests we need 70 GB (0.2 * 700 million * 500 bytes) 

Non-functional requirements:


  • Very low latency

  • Very high availability


Functional requirements:


  • Get short url

  • Redirect to long url


Basic design:

Basic design

  • User interface - we take long url as an input and try to convert it into tiny url.
  • LB - load balancer.
  • Short url service - service that creates short unique urls.
  • Cassandra - database where we store urls.


How “short url” service should work under the hood:


General idea: generate random unique number and convert it from base 10 to base 62. 


But how to generate unique numbers and handle collisions?


Iteration 1: generate random number and retry generation if collision happens - not efficient :( 


Iteration 2: generate unique number using redis cluster incrementor.



Design (iteration 2)


Issues with solution: we use only one redis, hence system is not highly available. In case if to use multiple redises - we will start generating duplicates.


Iteration 3 (final): create token service that orchestrates unique number ranges, then Short url services use requested ranges and increment value.


Design (iteration 3)


Issues with solution and how to handle them:


  • Incremented value is out of given range - then just request next range.

  • We get new bottleneck - we can increase the given range in order to decrease the service load, also we can have multiple instances of token service and use leader election, we can locate token services across different regions depending on our load.

  • What if short url service shuts down (incrementing is in memory) - just request next range, it’s okay if we loose them, just thousand out of 3.5 T tokens.


Lessons learned:


Never create a single point of failure.


Resources:


TinyURL System Design | Bitly System Design


System Design : Scalable URL shortener service like TinyURL


Design URL Shortening service like TinyURL


URL shortener System design


Comments

Popular posts from this blog

System design tips on how to make capacity estimation

IMAP protocol summary