I've searched in many places for an answer to this question. Most of the top Google results are copy-pasted from a single source, and the others are not particularly helpful. I'm not sure whether I'm allowed to include links from my search, so I'm refraining for now.
URL shortening has two aspects:
- Write: take a long URL, store a mapping from short URL to long URL, and return the short URL.
- Read: take a short URL, look up the long URL, and possibly redirect to it.
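To pin down what I mean by the two operations, here is a minimal single-process sketch. Everything in it is my own assumption for illustration: the `Shortener` class, the `https://short.example/` domain, the in-memory dict standing in for the mapping DB, and the base-62 counter standing in for the key generation service.

```python
import itertools

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a base-62 string."""
    if n == 0:
        return BASE62[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(BASE62[rem])
    return "".join(reversed(digits))


class Shortener:
    """Hypothetical in-memory shortener; a real one would use a database."""

    def __init__(self, base_url: str = "https://short.example/"):
        self.base_url = base_url
        self._counter = itertools.count(1)  # stand-in for a key generation service
        self._short_to_long = {}            # stand-in for the mapping DB

    def write(self, long_url: str) -> str:
        """Store a short -> long mapping and return the short URL."""
        key = encode_base62(next(self._counter))
        self._short_to_long[key] = long_url
        return self.base_url + key

    def read(self, short_url: str):
        """Return the long URL for a short URL, or None if unknown."""
        key = short_url.rsplit("/", 1)[-1]
        return self._short_to_long.get(key)
```

Note that `read` only ever sees the short key; it never has the long URL to work with.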
The common design I see is:
> (Key Generation service) ----> (Application server)
>            |                            |
>            |                            |
>            |                            |
>            V                            V
>        (Key DB)           DB with Short vs Long mapping
It seems that the same application server handles both reads and writes.
Now for data partitioning, I get the same advice almost everywhere: hash the long URL (possibly with consistent hashing) to pick a particular server. This only helps the write path. For a read, we only have the short URL, so we would need to ask every application server: do you have a long URL for this short URL? And everyone admits that reads are much, much more important for this design than writes. So are they optimizing for the least important use case?
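To make the concern concrete, here is a rough sketch of the scheme as I understand it. The names (`ShardedByLongUrl`, `shard_for`, `NUM_SHARDS`) are mine, each shard is just a dict, and I use plain modulo hashing instead of consistent hashing to keep it short; the read problem looks the same to me either way.

```python
import hashlib

NUM_SHARDS = 4  # assumed number of servers, purely for illustration


def shard_for(value: str) -> int:
    """Map a string to a shard index with a stable hash (plain modulo)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


class ShardedByLongUrl:
    """Hypothetical store partitioned by the hash of the LONG URL."""

    def __init__(self):
        self.shards = [{} for _ in range(NUM_SHARDS)]
        self.shards_probed = 0  # counts how many shards reads have touched

    def write(self, short_key: str, long_url: str) -> None:
        # Write path: the long URL is known, so one shard is chosen directly.
        self.shards[shard_for(long_url)][short_key] = long_url

    def read(self, short_key: str):
        # Read path: only the short key is known, so the shard cannot be
        # computed from it and every shard may have to be asked in turn.
        for shard in self.shards:
            self.shards_probed += 1
            if short_key in shard:
                return shard[short_key]
        return None
```

A write touches exactly one shard, but a read for an unknown key touches all `NUM_SHARDS` of them, which is the scatter-gather cost I am asking about.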
Am I missing something very basic?