System Design: Scalable URL shortener service like TinyURL

Sandeep Verma
21 min read · May 5, 2021

The goal of this article is to discuss the system design of services like bitly.com, tinyurl.com, t.co, etc. Innumerable articles on this topic can be found on Google, and they cover the various algorithms employed in URL shortener design in great detail. However, they lack an in-depth description of the various system design choices. My goal in this article is to fill in those missing parts.

1. URL Shortener service

A URL shortener service creates a short url (also called an alias or tiny url) against a long url. When a user clicks on the tiny url, he/she gets redirected to the original url.

Tiny urls are exceedingly handy to share through SMS/tweets (where there is a limit on the number of characters that can be messaged/tweeted) and also when they are printed in books/magazines etc. (fewer characters implies less printing cost). In addition, a short url is easier and less error prone to type when compared to its longer version.

For example, I have created a tiny url to my medium profile using a free short link generation service called tinyurl.com (Try clicking the shortened link, and also try generating a few short urls of your own at tinyurl.com)

Long URL: https://medium.com/@sandeep4.verma
Short URL: https://tinyurl.com/3sh2ps6v

2. Features

There can be a myriad of features even in a seemingly small service like URL shortening. Hence it’s always a good idea to ask the interviewer what features are expected of the service. These then become the basis for the functional requirements.

How long would a tiny url be? Will it ever expire?

Assume that once a url is created it will remain in the system forever.

Can a customer create a tiny url of his/her choice, or will it always be service generated? If users are allowed to create custom shortened links, what is the maximum size of a custom url?

Yes, a user can create a tiny url of his/her choice. Assume a maximum character limit of 16.

How many url shortening requests are expected per month?

Assume 100 million new URL shortenings per month.

Do we expect the service to provide metrics like most visited links?

Yes. The service should also aggregate metrics like the number of URL redirections per day and other analytics for targeted advertisements.

3. System Design goals

Functional Requirements:

  • Service should be able to create a shortened url/link against a long url
  • Clicking the short URL should redirect the user to the original long URL
  • The shortened link should be as short as possible
  • Users can create a custom url with a maximum character limit of 16
  • Service should collect metrics like most clicked links
  • Once a shortened link is generated, it should stay in the system for its lifetime

Non-Functional Requirements:

  • Service should be up and running at all times
  • URL redirection should be fast and should not degrade at any point of time (even during peak loads)
  • Service should expose REST APIs so that it can be integrated with third-party applications

4. Traffic and System Capacity

Traffic

Assuming a 200:1 read/write ratio:

Number of unique shortened links generated per month = 100 million

Number of unique shortened links generated per second = 100 million / (30 days * 24 hours * 3600 seconds) ≈ 40 URLs/second

With a 200:1 read/write ratio, number of redirections = 40 URLs/s * 200 = 8000 URLs/s

Storage

Assuming the lifetime of the service to be 100 years, with 100 million shortened links created per month, the total number of data points/objects in the system will be = 100 million/month * 100 (years) * 12 (months/year) = 120 billion

Assuming each data object (short url, long url, creation date, etc.) to be 500 bytes, the total required storage = 120 billion * 500 bytes = 60 TB

Memory

Following the Pareto Principle, better known as the 80:20 rule, for caching: 80% of requests are for 20% of the data.

Since we get 8000 read/redirection requests per second, we will be getting ~700 million requests per day:

8000/s * 86,400 seconds ≈ 700 million

To cache 20% of these requests, we will need ~70GB of memory.

0.2 * 700 million * 500 bytes ≈ 70GB

Estimates

Shortened URLs generated: 40/s

Shortened URLs generated in 100 years: 120 billion

Total URL requests/redirections: 8000/s

Storage for 100 years: 60TB

Cache memory: 70GB
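
The same arithmetic, reproduced as a tiny Java program for sanity checking. The constants are exactly the assumptions above; the small differences from the quoted figures come from the text rounding 38/s up to 40/s and 7,600/s up to 8000/s:

public class CapacityEstimates {
    public static void main(String[] args) {
        long newUrlsPerMonth = 100_000_000L;
        long writesPerSec = newUrlsPerMonth / (30L * 24 * 3600);  // 38, rounded to 40/s in the text
        long readsPerSec = writesPerSec * 200;                    // 7,600, quoted as 8000/s
        long totalObjects = newUrlsPerMonth * 12 * 100;           // 120 billion over 100 years
        long storageTB = totalObjects * 500 / 1_000_000_000_000L; // 60 TB at 500 bytes/object
        long readsPerDay = readsPerSec * 86_400;                  // ~660 million/day (~700M in the text)
        long cacheGB = readsPerDay / 5 * 500 / 1_000_000_000L;    // hot 20% of a day's reads: 65 GB here, ~70 GB in the text
        System.out.printf("writes/s=%d reads/s=%d storage=%dTB cache=%dGB%n",
                writesPerSec, readsPerSec, storageTB, cacheGB);
    }
}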

5. High Level Design

Following is the high-level design of our URL service. This is a rudimentary design; we will optimise it further as we move along.

Fig. 1 A Rudimentary design for URL service

Problems with above design :

  1. There is only one WebServer, which is a single point of failure (SPOF)
  2. The system is not scalable
  3. There is only a single database, which might not be sufficient for 60 TB of storage and a high read load of 8000 requests/s

To address the above limitations, we:

  1. Added a load balancer in front of WebServers
  2. Sharded the database to handle huge object data
  3. Added cache system to reduce load on the database.

We will delve further into each component as we go through the algorithms in the later sections.

Fig. 2 Scalable high level design

6. REST Endpoints

Let’s start by making two functions accessible through a REST API:

create(long_url, api_key, custom_url)

POST
TinyURL: POST: https://tinyurl.com/app/api/create
Request Body: {url=long_url}
Returns OK (200), with the generated short_url in the response data

long_url: A long URL that needs to be shortened.

api_key: A unique API key provided to each user, to protect against spammers and to control access and resource usage per user.

custom_url (optional): The custom short link URL the user wants to use.

Return Value: The short URL generated, or an error code in case of an inappropriate parameter.

GET: /{short_url}
Returns an HTTP redirect response (302)

Note: An “HTTP 302 Redirect” status is sent back to the browser instead of an “HTTP 301 Redirect”. A 301 redirect means the page has permanently moved to a new location, so browsers cache the mapping and later clicks may never reach our servers. A 302 redirect means the move is only temporary, so every redirection request reaches our backend and we can perform analytics (which is a functional requirement).

short_url: The short URL generated by the create function above.

Return Value: The original long URL, or an invalid-URL error code.
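
As a concrete illustration, here is a minimal sketch of these two endpoints as a Spring Boot controller, assuming Spring Web is on the classpath. UrlController, the nested UrlService interface, and its method names are illustrative, not part of the article’s design:

import java.net.URI;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@RestController
public class UrlController {

    // hypothetical service layer backing the two endpoints
    interface UrlService {
        String create(String longUrl, String apiKey, String customUrl);
        String resolve(String shortUrl); // null if the key is unknown
    }

    private final UrlService urlService;

    public UrlController(UrlService urlService) {
        this.urlService = urlService;
    }

    // POST /app/api/create -> 200 with the generated short_url
    @PostMapping("/app/api/create")
    public ResponseEntity<String> create(@RequestParam("url") String longUrl,
                                         @RequestParam("api_key") String apiKey,
                                         @RequestParam(value = "custom_url", required = false) String customUrl) {
        return ResponseEntity.ok(urlService.create(longUrl, apiKey, customUrl));
    }

    // GET /{short_url} -> 302 so every click still hits our backend (for analytics)
    @GetMapping("/{shortUrl}")
    public ResponseEntity<Void> redirect(@PathVariable String shortUrl) {
        String longUrl = urlService.resolve(shortUrl);
        if (longUrl == null) {
            return ResponseEntity.notFound().build(); // invalid short URL
        }
        return ResponseEntity.status(HttpStatus.FOUND) // 302, deliberately not 301
                .location(URI.create(longUrl))
                .build();
    }
}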

7. Database Schema

Let’s see the data we need to store:

Data Related to user

  1. User ID: A unique user id or API key to make user globally distinguishable
  2. Name: The name of the user
  3. Email: The email id of the user
  4. Creation Date: The date on which the user was registered

Data Related to ShortLink

  1. Short Url: A 6/7-character unique short URL
  2. Original Url: The original long URL
  3. UserId: The unique user id or API key of the user who created the short URL

8. Shortening Algorithm

For shortening a url we can use the following two approaches (URL encoding and a Key Generation Service). Let’s walk through each of them one by one.

a.) URL Encoding

i.) URL encoding through base62

ii) URL encoding through MD5

b.) Key Generation Service (KGS)

URL encoding through base62

A base is the number of digits or characters that can be used to represent a particular number.

Base 10 uses the digits [0–9], which we use in everyday life, and base 62 uses [0–9][a-z][A-Z].

Let’s do a back-of-the-envelope calculation to find out how many characters we should keep in our tiny url.

URL with length 5, will give 62⁵ = ~916 Million URLs
URL with length 6, will give 62⁶ = ~56 Billion URLs
URL with length 7, will give 62⁷ = ~3500 Billion URLs

Since we need to accommodate 120 billion URLs, 7 characters in base62 (≈3500 billion combinations) will be more than sufficient. Hence each generated tiny url will have 7 characters.

How to get unique ‘7 character’ long random URLs in base62

Once we have decided the number of characters to use in the tiny URL (7 characters) and the base to use (base62 [0–9][a-z][A-Z]), the next challenge is how to generate unique URLs that are 7 characters long.

Technique 1 — Short url from random numbers:

We could just make a random choice for each character and check whether the resulting tiny url already exists in the DB. If it doesn’t exist, return it; otherwise keep rolling/retrying. As more and more 7-character short links are stored in the database, collisions become more frequent and we may need several rolls before finding an unused short link, which slows down tiny url generation.

Java code for randomly generating a 7-character tiny url:

private static final int NUM_CHARS_SHORT_LINK = 7;
private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
private final Random random = new Random();

public String generateRandomShortUrl() {
    char[] result = new char[NUM_CHARS_SHORT_LINK];
    while (true) {
        for (int i = 0; i < NUM_CHARS_SHORT_LINK; i++) {
            // nextInt(bound) already excludes the bound, so use the full
            // alphabet length (length() - 1 would never pick the last char)
            int randomIndex = random.nextInt(ALPHABET.length());
            result[i] = ALPHABET.charAt(randomIndex);
        }
        String shortLink = new String(result);
        // make sure the short link isn't already used (DB is the backing store)
        if (!DB.checkShortLinkExists(shortLink)) {
            return shortLink;
        }
    }
}

Technique 2 — Short urls from base conversion:

Think of the seven-character short url as a number in base 62, whose “digits” are 0–9, a–z and A–Z (for e.g. aKc3K4b). Each short url can be mapped to a decimal integer using base conversion, and vice versa.

How do we do the base conversion? This is easiest to show by an example.

Take the number 125 in base 10.

It has a 1 in the 100s place, a 2 in the 10s place, and a 5 in the 1s place. In general, the places in a base-10 number are:

  • 10⁰
  • 10¹
  • 10²
  • 10³
  • etc

The places in a base-62 number are:

  • 62⁰
  • 62¹
  • 62²=3,844
  • 62³=238,328
  • etc.

So to convert 125 to base-62, we distribute that 125 across these base-62 “places.” The highest “place” that fits into 125 is 62¹, which is 62. 125/62 is 2, with a remainder of 1. So we put a 2 in the 62s place and a 1 in the 1s place, and our answer is 21.

What about a higher number — say, 7,912?

Now we have enough to put something in the 3,844s place (the 62²s place). 7,912 / 3,844 is 2 with a remainder of 224. So we put a 2 in the 3,844s place, and we distribute that remaining 224 across the remaining places: the 62s place and the 1s place. 224 / 62 is 3 with a remainder of 38. So we put a 3 in the 62s place and a 38 in the 1s place. We have this three-digit number: 2-3-38.

Now, that “38” represents one numeral in our base-62 number. So we need to convert that 38 into a specific choice from our set of numerals: a-z, A-Z, and 0–9.

Let’s number each of our 62 numerals, like so:

0: 0,
1: 1,
2: 2,
3: 3,
...
10: a,
11: b,
12: c,
...
36: A,
37: B,
38: C,
...
61: Z

As you can see, our “38th” numeral is “C.” So we convert that 38 to a “C.” That gives us 23C.
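
Here is the same hand calculation in a few lines of Java (a minimal helper with no padding, so its output matches the worked examples exactly):

public class Base62Demo {
    private static final String ELEMENTS =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    static String toBase62(long n) {
        StringBuilder sb = new StringBuilder();
        while (n != 0) {
            sb.insert(0, ELEMENTS.charAt((int) (n % 62))); // numeral for this place
            n /= 62;                                       // move to the next place
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBase62(125));  // prints "21"
        System.out.println(toBase62(7912)); // prints "23C"
    }
}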

So we can start with a counter (a large number, 100000000000 in base 10, which is 1L9zO9O in base 62) and increment the counter every time we get a request for a new short url (100000000001, 100000000002, 100000000003, etc.). This way we will always get a unique short url.

100000000000 (Base 10) ==> 1L9zO9O (Base 62)

Similarly, when we get a tiny url for redirection, we can convert the base62 tiny url back to an integer in base 10:

1L9zO9O (Base 62) ==>100000000000 (Base 10)

Java code for generating a 7-character tiny url from a counter:

import java.util.HashMap;

public class URLService {
    // long url -> counter and counter -> long url
    private final HashMap<String, Long> ltos = new HashMap<>();
    private final HashMap<Long, String> stol = new HashMap<>();
    // 100000000000 overflows int, so the counter must be a long
    private static long COUNTER = 100000000000L;
    private static final String ELEMENTS =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    public String longToShort(String url) {
        String shorturl = base10ToBase62(COUNTER);
        ltos.put(url, COUNTER);
        stol.put(COUNTER, url);
        COUNTER++;
        return "http://tiny.url/" + shorturl;
    }

    public String shortToLong(String url) {
        url = url.substring("http://tiny.url/".length());
        long n = base62ToBase10(url);
        return stol.get(n);
    }

    public long base62ToBase10(String s) {
        long n = 0;
        for (int i = 0; i < s.length(); i++) {
            n = n * 62 + convert(s.charAt(i));
        }
        return n;
    }

    private int convert(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'z') return c - 'a' + 10;
        if (c >= 'A' && c <= 'Z') return c - 'A' + 36;
        return -1;
    }

    public String base10ToBase62(long n) {
        StringBuilder sb = new StringBuilder();
        while (n != 0) {
            sb.insert(0, ELEMENTS.charAt((int) (n % 62)));
            n /= 62;
        }
        while (sb.length() != 7) {
            sb.insert(0, '0'); // left-pad to a fixed 7-character short url
        }
        return sb.toString();
    }
}
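
A quick round trip through this class reproduces the example above:

URLService service = new URLService();
String shortUrl = service.longToShort("https://medium.com/@sandeep4.verma");
// shortUrl == "http://tiny.url/1L9zO9O" (counter 100000000000 in base 62)
System.out.println(service.shortToLong(shortUrl)); // prints the original long url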

Technique 3 — MD5 hash:

The MD5 message-digest algorithm is a widely used hash function producing a 128-bit hash value (or 32 hexadecimal digits). We can use these 32 hexadecimal digits to generate a 7-character tiny url.

  • Encode the long URL using the MD5 algorithm and take only the first 7 characters to generate the TinyURL.
  • The first 7 characters could be the same for different long URLs, so check the DB to verify that this TinyURL is not already in use.
  • If those 7 characters already exist in the DB, slide the window and try the next 7 characters, continuing until you find a unique value.

import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class MD5Utils {
    private static final int SHORT_URL_CHAR_SIZE = 7;

    public static String convert(String longURL) {
        try {
            // Create MD5 hash (128 bits / 16 bytes)
            MessageDigest digest = MessageDigest.getInstance("MD5");
            digest.update(longURL.getBytes());
            byte[] messageDigest = digest.digest();
            // Create hex string; %02x keeps leading zeros so the result
            // is always the full 32 hex digits
            StringBuilder hexString = new StringBuilder();
            for (byte b : messageDigest) {
                hexString.append(String.format("%02x", b));
            }
            return hexString.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e);
        }
    }

    public static String generateRandomShortUrl(String longURL) {
        String hash = MD5Utils.convert(longURL);
        // Slide a 7-character window over the hash until we find an unused key
        for (int start = 0; start <= hash.length() - SHORT_URL_CHAR_SIZE; start++) {
            String candidate = hash.substring(start, start + SHORT_URL_CHAR_SIZE);
            if (!DB.exists(candidate)) {
                return candidate;
            }
        }
        return null; // every window collided; caller must retry (e.g. re-hash with a salt)
    }
}

With all these shortening algorithms, let’s revisit our design goals!

  1. Being able to store a lot of short links (120 billion)
  2. Our TinyURL should be as short as possible (7 characters)
  3. Application should be resilient to load spikes (For both url redirections and short link generation)
  4. Following a short link should be fast

All the techniques discussed above will help us achieve goals (1) and (2), but they fall short on (3) and (4). Let’s see how.

  1. A single web server is a single point of failure (SPOF). If this web server goes down, none of our users will be able to generate tiny urls or be redirected from tiny urls to the original (long) urls. This can be handled by adding more web servers for redundancy and putting a load balancer in front of them, but even with this design choice the next challenge comes from the database.
  2. For the database we have two options:

a. Relational databases (RDBMSs) like MySQL and Postgres

b. “NoSQL”-style databases like BigTable and Cassandra

If we choose an RDBMS as our database, it is efficient to check whether a URL already exists in the database and to handle concurrent writes. But RDBMSs are difficult to scale (we will nevertheless use an RDBMS in one of our techniques and see how to scale it).

If we opt for NoSQL, we can leverage the scaling power of NoSQL-style databases, but these systems are eventually consistent (unlike RDBMSs, which provide ACID consistency).

We can use putIfAbsent(tinyURL, longURL) or an INSERT-IF-NOT-EXISTS condition while inserting a tiny URL, but this requires support from the DB, which is available in an RDBMS but not in most NoSQL stores. Since data is eventually consistent in NoSQL, putIfAbsent support might not be available (we can work around this using some NoSQL features, discussed later in this article). Without it, we risk consistency issues: two different long URLs ending up with the same tiny url.

Scaling Technique 1 — Short url from random numbers

We can use a relational database as our backend store, but since we don’t have joins between objects and we require huge storage (60 TB) along with high write (40 URLs/second) and read (8000/second) throughput, a NoSQL store (MongoDB, Cassandra, etc.) will be a better choice.

We could certainly scale using SQL by building custom partitioning and replication ourselves (features that MongoDB and Cassandra provide by default), but this is difficult to develop and maintain.

Let’s discuss how to use MongoDB to scale the database for the shortening algorithm given in Technique 1.

MongoDB supports distributing data across multiple machines using shards. Since we have to support large data sets and high throughput, we can leverage MongoDB’s sharding feature.

Once we generate a 7-character tiny url, we can use this tinyURL as the shard key with a hashed sharding strategy. MongoDB automatically computes the hashes (of the tiny url) when resolving queries; applications do not need to compute them. Data distribution based on hashed values facilitates a more even spread, especially for shard keys that change monotonically.

We can scale reads/writes by increasing the number of shards for the collection containing the tinyURL data (a schema with three fields: shortUrl, originalUrl, userId).

Schema for the tinyURL collection:

{
  _id: <ObjectId102>,
  shortUrl: "https://tinyurl.com/3sh2ps6v",
  originalUrl: "https://medium.com/@sandeep4.verma",
  userId: "sandeepv"
}

Since MongoDB supports transactions on a single document, we can maintain consistency, and because we are using a hash-based shard key we can efficiently implement putIfAbsent (a given tiny url’s hash always maps to exactly one shard).

To speed up reads (checking whether a short URL exists in the DB, or looking up the original url corresponding to a short URL), we can create an index on shortUrl.
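
A minimal, unsharded sketch of this insert-if-absent idea with the MongoDB Java driver is below; the duplicate-key error from a unique index plays the role of putIfAbsent. The database, collection, and class names are illustrative:

import com.mongodb.ErrorCategory;
import com.mongodb.MongoWriteException;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.IndexOptions;
import com.mongodb.client.model.Indexes;
import org.bson.Document;

public class TinyUrlStore {
    private final MongoCollection<Document> links;

    public TinyUrlStore(String connectionString) {
        MongoClient client = MongoClients.create(connectionString);
        links = client.getDatabase("tinyurl").getCollection("links");
        // the unique index is what makes concurrent inserts safe
        links.createIndex(Indexes.ascending("shortUrl"), new IndexOptions().unique(true));
    }

    /** Insert-if-absent: true if we won the key, false if it was already taken. */
    public boolean putIfAbsent(String shortUrl, String originalUrl, String userId) {
        try {
            links.insertOne(new Document("shortUrl", shortUrl)
                    .append("originalUrl", originalUrl)
                    .append("userId", userId));
            return true;
        } catch (MongoWriteException e) {
            if (e.getError().getCategory() == ErrorCategory.DUPLICATE_KEY) {
                return false; // collision: caller generates a new key and retries
            }
            throw e;
        }
    }
}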

We will also use a cache to further speed up reads (we will discuss caching in a later part of this article).

Scaling Technique 2 — Short urls from base conversion

We used a counter (a large number) and converted it into a 7-character base62 tinyURL. As the counter is always incremented, we get a new value for every new request, so we don’t need to worry about producing the same tinyURL for different long/original urls.

Scaling with SQL sharding and auto increment

Sharding is a scale-out approach in which database tables are partitioned, and each partition is put on a separate RDBMS server. For SQL, this means each node has its own separate SQL RDBMS managing its own separate set of data partitions. This data separation allows the application to distribute queries across multiple servers simultaneously, creating parallelism and thus increasing the scale of that workload. However, this data and server separation also creates challenges, including sharding key choice, schema design, and application rewrites. Additional challenges of sharding MySQL include data maintenance, infrastructure maintenance, and business challenges.

Before an RDBMS can be sharded, several design decisions must be made. Each of these is critical to both the performance of the sharded array and the flexibility of the implementation going forward. These design decisions include the following:

  • Sharding key must be chosen
  • Schema changes
  • Mapping between sharding key, shards (databases), and physical servers

We can use the auto-incrementing counter as the sharding key and divide it into ranges: for example, server 1 handles the range 1 to 10M, server 2 the range 10M to 20M, and so on.

We can start the counter from 100000000000, so the counter range for each SQL database instance will be 100000000000+1 to 100000000000+10M, then 100000000000+10M+1 to 100000000000+20M, and so on.

We can start with 100 database instances, and whenever an instance reaches its maximum limit (10M entries) we stop saving objects there and spin up a new server instance. If one instance is unavailable or down, or when we require higher write throughput, we can likewise spawn new server instances.

Schema for the tinyUrl table in an RDBMS:

CREATE TABLE tinyUrl (
    id BIGINT NOT NULL AUTO_INCREMENT,
    shortUrl VARCHAR(7) NOT NULL,
    originalUrl VARCHAR(400) NOT NULL,
    userId VARCHAR(50) NOT NULL,
    PRIMARY KEY (id),  -- an index is created automatically on the primary-key column
    INDEX (shortUrl),
    INDEX (originalUrl)
);

Where do we keep information about the active database instances?

Solution: We can use a distributed coordination service like Zookeeper to solve the various challenges of a distributed system, such as race conditions, deadlocks, and partial failures. Zookeeper is basically a distributed coordination service that manages a large set of hosts. It keeps track of things such as server naming, active database servers, dead servers, and configuration information (which server is managing which range of counters) for all the hosts, and it provides coordination and synchronization between the multiple servers.
Let’s discuss how to maintain a counter for distributed hosts using Zookeeper.

  • From the 3500 billion URL combinations, take the 1st billion combinations.
  • In Zookeeper, maintain the ranges: divide the 1st billion into 100 ranges of 10 million each, i.e. range 1 -> (1–10,000,000), range 2 -> (10,000,001–20,000,000), …, range 100 -> (990,000,001–1,000,000,000). (Add 100000000000 to each range for the counter.)
  • When servers are added, they ask Zookeeper for an unused range. Suppose server W1 is assigned range 1; W1 then generates tiny URLs by incrementing its counter and applying the encoding technique. Every value is unique, so there is no possibility of collision and no need to keep checking the DB to ensure the URL doesn’t already exist; we can directly insert the mapping of long URL and short URL into the DB. (A sketch of this range allocation follows this list.)
  • In the worst case, if one of the servers goes down, only that range of counters is affected. We can replicate the master’s data to its slaves, and while we bring the master back, divert read queries to its slaves.
  • If one of the databases reaches its maximum range or limit, we can move that database instance out of the set of active instances that accept writes, and add a new database with a fresh range to Zookeeper. The full instance will then be used only for reads.
  • Adding a new database is also easy: Zookeeper assigns an unused counter range to it.
  • We will take the 2nd billion when the 1st billion is exhausted, and continue the process.
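
Here is a minimal sketch of that range allocation using Apache Curator, a popular Zookeeper client. The znode path, range size, and retry policy are assumptions for illustration:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.atomic.AtomicValue;
import org.apache.curator.framework.recipes.atomic.DistributedAtomicLong;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class RangeAllocator {
    private static final long RANGE_SIZE = 10_000_000L;
    private static final long COUNTER_BASE = 100_000_000_000L;

    public static long[] claimNextRange(String zkConnect) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                zkConnect, new ExponentialBackoffRetry(1000, 3));
        client.start();
        // one shared counter in Zookeeper hands out range indexes atomically
        DistributedAtomicLong nextRange = new DistributedAtomicLong(
                client, "/tinyurl/next-range", new ExponentialBackoffRetry(1000, 3));
        AtomicValue<Long> result = nextRange.increment();
        if (!result.succeeded()) {
            throw new IllegalStateException("could not claim a counter range");
        }
        long rangeIndex = result.postValue(); // 1, 2, 3, ...
        long start = COUNTER_BASE + (rangeIndex - 1) * RANGE_SIZE + 1;
        long end = COUNTER_BASE + rangeIndex * RANGE_SIZE;
        return new long[] {start, end}; // server then increments locally within [start, end]
    }
}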

How do we check whether a short URL is present in the database?

Solution: When we get a tiny url (for example 1L9zO9O), we can use the base62ToBase10 function to get the counter value (100000000000). Once we have this value, we can look up in Zookeeper which database instance this counter range belongs to (say database instance 1), and then send a SQL query to that server (SELECT * FROM tinyUrl WHERE id = 100000000000). This returns the row data, if present.
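
Putting the read path together in a short sketch (COUNTER_BASE and RANGE_SIZE mirror the ranges above; in production the range-to-instance mapping would be read from Zookeeper rather than computed locally):

public class LookupRouter {
    private static final long COUNTER_BASE = 100_000_000_000L;
    private static final long RANGE_SIZE = 10_000_000L;

    // Range 0 owns [BASE, BASE + 10M), range 1 owns [BASE + 10M, BASE + 20M), ...
    public static int rangeIndexFor(long counter) {
        return (int) ((counter - COUNTER_BASE) / RANGE_SIZE);
    }

    public static void main(String[] args) {
        // base62ToBase10 is the method from the URLService class above
        long counter = new URLService().base62ToBase10("1L9zO9O"); // 100000000000
        int range = rangeIndexFor(counter); // ask Zookeeper which DB owns this range
        System.out.println("range " + range
                + ": SELECT * FROM tinyUrl WHERE id = " + counter);
    }
}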

How do we get the original url corresponding to a tiny url?

Solution: We can leverage the above solution to get back the SQL row data, which contains the shortUrl, originalUrl and userId.

Scaling Technique 3 — MD5 hash

We can leverage scaling Technique 1 (using MongoDB). We can also use Cassandra in place of MongoDB; in Cassandra, instead of a shard key we use a partition key to distribute the data.

Technique 4 — Key Generation Service (KGS)

We can have a standalone Key Generation Service (KGS) that generates random seven-letter strings beforehand and stores them in a database (let’s call it key-DB). Whenever we want to shorten a URL, we take one of the already-generated keys and use it. This approach makes things quite simple and fast: not only are we not encoding the URL, we also don’t have to worry about duplications or collisions. KGS makes sure all the keys inserted into key-DB are unique.

Can concurrency cause problems? As soon as a key is used, it should be marked in the database to ensure that it is not used again. If there are multiple servers reading keys concurrently, we might get a scenario where two or more servers try to read the same key from the database. How can we solve this concurrency problem?

Servers can use KGS to read/mark keys in the database. KGS can use two tables to store keys: one for keys that are not used yet, and one for all the used keys. As soon as KGS gives keys to one of the servers, it can move them to the used keys table. KGS can always keep some keys in memory to quickly provide them whenever a server needs them.

For simplicity, as soon as KGS loads some keys in memory, it can move them to the used keys table. This ensures each server gets unique keys. If KGS dies before assigning all the loaded keys to some server, we will be wasting those keys, which could be acceptable given the huge number of keys we have.

KGS also has to make sure not to give the same key to multiple servers. For that, it must synchronize (or get a lock on) the data structure holding the keys before removing keys from it and giving them to a server.
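
A minimal single-node sketch of this hand-out logic is below. The KeyDao interface stands in for the two key-DB tables; in the real service the “mark used” write would be durable before any key leaves KGS, exactly as described above:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class KeyGenerationService {
    interface KeyDao {                      // assumed data-access layer
        List<String> loadUnusedKeys(int n); // read n keys from the unused table
        void markUsed(List<String> keys);   // move them to the used table
    }

    private static final int BATCH_SIZE = 1000;
    private final KeyDao dao;
    private final Deque<String> inMemoryKeys = new ArrayDeque<>();

    public KeyGenerationService(KeyDao dao) {
        this.dao = dao;
    }

    // synchronized so two app servers can never receive the same key
    public synchronized String nextKey() {
        if (inMemoryKeys.isEmpty()) {
            List<String> batch = dao.loadUnusedKeys(BATCH_SIZE);
            dao.markUsed(batch); // mark immediately: a KGS crash only wastes this batch
            inMemoryKeys.addAll(batch);
        }
        return inMemoryKeys.poll(); // null only if key-DB is exhausted
    }
}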

Isn’t KGS a single point of failure? Yes, it is. To solve this, we can have a standby replica of KGS. Whenever the primary server dies, the standby server can take over to generate and provide keys.

Can each app server cache some keys from key-DB? Yes, this can surely speed things up, although if the application server dies before consuming all its keys, we will end up losing them. This can be acceptable given the huge number (~3500 billion) of unique seven-character keys we have.

How would we perform a key lookup? We can look up the key in our database to get the full URL. If it’s present in the DB, issue an “HTTP 302 Redirect” status back to the browser, passing the stored URL in the “Location” field of the response. If that key is not present in our system, issue an “HTTP 404 Not Found” status or redirect the user back to the homepage.

Should we impose size limits on custom aliases? Our service supports custom aliases: users can pick any ‘key’ they like, but providing a custom alias is not mandatory. However, it is reasonable (and often desirable) to impose a size limit on custom aliases to ensure we have a consistent URL database. Let’s assume users can specify a maximum of 16 characters per custom key (as per our functional requirements).

9. Cache

We can cache URLs that are frequently accessed. We can use some off-the-shelf solution like Memcached, which can store full URLs with their respective hashes. Before hitting backend storage, the application servers can quickly check if the cache has the desired URL.

How much cache memory should we have? We can start with 20% of daily traffic and, based on clients’ usage patterns, we can adjust how many cache servers we need. As estimated above, we need 70GB memory to cache 20% of daily traffic. Since a modern-day server can have 256GB memory, we can easily fit all the cache into one machine. Alternatively, we can use a couple of smaller servers to store all these hot URLs.

Which cache eviction policy would best fit our needs? When the cache is full, and we want to replace a link with a newer/hotter URL, how would we choose? Least Recently Used (LRU) can be a reasonable policy for our system. Under this policy, we discard the least recently used URL first. We can use a Linked Hash Map or a similar data structure to store our URLs and Hashes, which will also keep track of the URLs that have been accessed recently.
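
Java’s LinkedHashMap gives this behaviour almost for free; here is a tiny LRU sketch (the capacity and the short-url-to-long-url mapping are illustrative):

import java.util.LinkedHashMap;
import java.util.Map;

public class UrlLruCache extends LinkedHashMap<String, String> {
    private final int capacity;

    public UrlLruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder=true: reads refresh an entry's recency
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        return size() > capacity; // evict the least recently used URL first
    }
}

// Usage: UrlLruCache cache = new UrlLruCache(1_000_000);
// cache.put("3sh2ps6v", "https://medium.com/@sandeep4.verma");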

To further increase the efficiency, we can replicate our caching servers to distribute the load between them.

How can each cache replica be updated? Whenever there is a cache miss, our servers would be hitting a backend database. Whenever this happens, we can update the cache and pass the new entry to all the cache replicas. Each replica can update its cache by adding the new entry. If a replica already has that entry, it can simply ignore it.

10. Load Balancer (LB)

We can add a Load balancing layer at three places in our system:

  1. Between Clients and Application servers
  2. Between Application Servers and database servers
  3. Between Application Servers and Cache servers

Initially, we could use a simple Round Robin approach that distributes incoming requests equally among backend servers. This LB is simple to implement and does not introduce any overhead. Another benefit of this approach is that if a server is dead, LB will take it out of the rotation and will stop sending any traffic to it.

A problem with Round Robin LB is that we don’t take the server load into consideration. If a server is overloaded or slow, the LB will not stop sending new requests to that server. To handle this, a more intelligent LB solution can be placed that periodically queries the backend server about its load and adjusts traffic based on that.

11. Customer provided tiny URLs

A customer can also provide a tiny url of his/her choice. We can allow customers to choose from alphanumerics and the “special characters” $-_.+!*’(), in the tiny url (with, let’s say, a minimum of 8 characters). When a customer provides a custom url, we can save it to a different database instance (separate from where the system-generated tiny urls live) and treat these tiny URLs as special URLs. When we get a redirection request for one of them, we divert it to the special instances of the WebServer. Since this is a paid service, we can expect very few tiny URLs of this kind and don’t need to worry about scaling the WebServers/Database in this case.

12. Analytics

How many times has a short URL been used? How would we store these statistics?

Since we send an “HTTP 302 Redirect” status to the browser instead of an “HTTP 301 Redirect”, each redirection for a tiny url reaches the service’s backend. We can push this data (tiny url, user, etc.) to a Kafka queue and perform analytics in real time.
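
For instance, the redirect path could publish a small click event to Kafka, keeping analytics fully asynchronous. This is a sketch: the topic name, event format, and bootstrap address are assumptions:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ClickEventPublisher {
    private final KafkaProducer<String, String> producer;

    public ClickEventPublisher(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        producer = new KafkaProducer<>(props);
    }

    // fire-and-forget on the redirect path so analytics never slows a 302
    public void publish(String shortUrl, String userId, long timestamp) {
        String value = String.format("{\"user\":\"%s\",\"ts\":%d}", userId, timestamp);
        producer.send(new ProducerRecord<>("url-clicks", shortUrl, value));
    }
}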

