Jorge González Rubio


SysAdmin & DevOps Engineer
Related topics: Cloud Systems Web Development

What is a CDN and why do your digital assets need one?

Tuesday January 26th, 2021
2-minute read

According to Wikipedia: “Content Delivery Network (CDN) is an overlay network of computers containing copies of data, placed at various points on a network in order to maximize bandwidth for the access to customer data over the network. A client accesses a copy of the information close to the client, as opposed to all clients accessing the same central server, in order to avoid funnels near that server.”

But what exactly is a CDN? In this article we will try to explain, in a simple way, what a CDN is and why you should have one.

 

What is a CDN?

Suppose we run a newspaper or blog whose servers are hosted in a data center in Madrid and whose content is in Spanish. From now on we will call this platform the “origin”: when we refer to “our origin”, we mean the servers that host the website. Our hypothetical site will be read by users outside Madrid, outside Spain and outside Europe. Because the site is in Spanish, much of its traffic will come from Latin America.

With this hypothetical site in mind, let’s see what happens to the traffic coming from different users.

User1, located in the north of Spain, begins to browse our site. He opens the front page and starts generating traffic (images, videos, etc.), which travels from his location to Madrid. Each request reaches our servers in Madrid, which generate the content. Once an object is generated, it is sent back to User1’s location.

Generating the resource requested by User1 takes time and consumes server resources: bandwidth, CPU, RAM, etc. The response time therefore grows with the number of requests the server must process at the same time. With a single user requesting an object the cost is insignificant, but imagine thousands of users requesting the same resource at once: the server would take longer and longer to deliver the content, slowing down more and more and, in the worst case, going down altogether.

What can we do to avoid these problems? The CDN has the answer. We place the CDN between the user and the origin (remember, the origin is the infrastructure and servers that host our website), so that all traffic generated by the user goes to the CDN first, and the CDN is the one that forwards it to the origin.

A CDN has many advantages, which we will see below, but the main one is web acceleration, achieved by caching the objects and resources it serves. How long objects are stored in the CDN is set by the HTTP “Cache-Control” header, but we will not delve into that topic since it is not the purpose of this post.
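Without going deeper than that, a minimal sketch can illustrate how a shared cache such as a CDN might read a storage time from “Cache-Control”. The function name shared_cache_ttl is hypothetical, and the logic is deliberately simplified: it only looks at no-store, private, s-maxage and max-age, and it treats a response with no explicit lifetime as not cacheable, which real CDNs may handle differently.

```python
def shared_cache_ttl(cache_control):
    """Return how many seconds a shared cache may keep a response (0 = do not keep).

    Simplified illustration: real caches honour more directives than these.
    """
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')

    # Responses marked no-store or private are not kept by a shared cache.
    if "no-store" in directives or "private" in directives:
        return 0

    # s-maxage targets shared caches, so it takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        value = directives.get(name, "")
        if value.isdigit():
            return int(value)
    return 0


print(shared_cache_ttl("public, max-age=300"))
```

With a header such as "max-age=60, s-maxage=3600", this sketch would let the CDN keep the object for an hour while browsers keep it for a minute.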

Let’s go back to our previous example. User1, located in the north of Spain, starts browsing and requests the front page of the site. That request reaches the CDN first, which does not have the requested object because no one has asked for it before. The CDN therefore requests it from the origin, which returns the object. The CDN receives the object, saves a copy (caches it) and delivers it to User1.

Now imagine that a moment later User2 arrives and also requests the front page, just as User1 did. Again the request goes to the CDN, with the difference that this time the CDN does have the resource User2 is requesting, so it is delivered directly without going to the origin.
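The User1 and User2 scenario can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the Origin and CdnNode classes are hypothetical, the cache is a plain in-memory dictionary, and entries never expire.

```python
class Origin:
    """Stand-in for our servers in Madrid; counts how often it generates content."""

    def __init__(self):
        self.requests_served = 0

    def fetch(self, path):
        self.requests_served += 1
        return f"<html>content of {path}</html>"


class CdnNode:
    """A single CDN node with a naive in-memory cache (no expiry)."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, path):
        if path in self.cache:
            # Cache hit: served directly, the origin is not contacted.
            return self.cache[path], "HIT"
        # Cache miss: ask the origin, keep a copy, then deliver.
        body = self.origin.fetch(path)
        self.cache[path] = body
        return body, "MISS"


origin = Origin()
cdn = CdnNode(origin)
_, status_user1 = cdn.get("/")  # User1 asks for the front page first
_, status_user2 = cdn.get("/")  # User2 asks for the same page a moment later
print(status_user1, status_user2, origin.requests_served)  # prints "MISS HIT 1"
```

The origin generates the front page only once, no matter how many users request it afterwards.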

We can now see how a CDN works and some of its main advantages.

Now suppose a User3, who lives in Mexico, also navigates to the front page of our site. Without a CDN, that request must cross the Atlantic and reach our servers in Madrid, and the response must travel back to Mexico. This happens for every single request made while browsing.

Now imagine thousands of requests from users in Latin America, all delayed purely because of distance, and how that slowness affects the user experience. Even if our site is super-mega-ultra-optimized, the physical factor of distance cannot be avoided. The solution is again the CDN, which has a multitude of PoPs (points of presence) distributed throughout the world and uses the one closest to each user. This service is called geo-positioning, and in our case it is based on the IP address: from the user’s IP, the CDN works out where the request comes from and directs it to the nearest node (CDN server), drastically increasing the speed of content delivery.
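As a rough illustration of choosing the nearest node, the sketch below picks the closest PoP by great-circle distance. Everything in it is an assumption made for the example: the three PoP locations, the user coordinates, and the idea that the user’s IP has already been resolved to a latitude and longitude by some geolocation database. How a given CDN actually routes users may differ.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical PoP locations as (latitude, longitude); a real CDN has many more.
POPS = {
    "madrid": (40.4168, -3.7038),
    "mexico-city": (19.4326, -99.1332),
    "sao-paulo": (-23.5505, -46.6333),
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))


def nearest_pop(user_location):
    """Return the name of the PoP closest to the user's (lat, lon)."""
    return min(POPS, key=lambda name: haversine_km(user_location, POPS[name]))


user1 = (43.2630, -2.9350)    # assumed: Bilbao, northern Spain
user3 = (20.6597, -103.3496)  # assumed: Guadalajara, Mexico
print(nearest_pop(user1), nearest_pop(user3))  # prints "madrid mexico-city"
```

User3 is served from a node a few hundred kilometres away instead of one on the other side of the ocean.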

Image taken from “Speed Index: Optimization and Measurement Guideline” by Holistic SEO.

Although the first request for an object will not be cached and must reach our origin, once the object is cached, subsequent requests for it are served from the nearest CDN node rather than from the origin.

A possible way to avoid what is commonly called a “cold cache” is to force requests to your site from different locations in order to “warm up” the cache, similar to running an automated crawler (a tool that visits and analyzes every URL of a site).
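A cache-warming script could look roughly like the sketch below, which uses only the Python standard library. The function names and the example URLs are hypothetical. Note that a script like this only warms the node that serves the location it runs from, so it would have to be run from several regions to warm several PoPs.

```python
from urllib.request import urlopen


def http_fetch(url):
    """Request a URL and return the HTTP status code."""
    with urlopen(url, timeout=10) as response:
        return response.status


def warm_cache(urls, fetch=http_fetch):
    """Request every URL once so the CDN node serving this location caches it."""
    results = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except OSError as error:  # URLError, HTTPError and timeouts are OSError subclasses
            results[url] = f"failed: {error}"
    return results


# Example usage (hypothetical URLs, performs real HTTP requests):
# print(warm_cache(["https://www.example.com/", "https://www.example.com/news"]))
```

In practice the list of URLs would typically come from the site’s sitemap or from a crawl of the site.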

 

Conclusion

We can say that a CDN is a layer made up of a distributed system of nodes, located between the user and the origin, that offers the following advantages:

  • Improvement in the user experience: with the content cached in the CDN, navigation is much more fluid, since we save the latency of processing and generating the content at the origin.
  • Content delivery speed: thanks to IP geo-positioning, users are always directed to their closest PoP, drastically reducing response times.
  • Availability: with the site and its headers well configured, the cache is optimized. This gives us a layer in front of our servers that acts as a shield against an outage at our origin, since the cached content is served from the CDN and the user does not notice any problem while browsing.
  • Security: CDNs have robust equipment and technologies that analyze traffic and request patterns and can block requests when they detect suspicious behavior. Another good practice is to restrict access to the origin to the IPs of the CDN, so that we cannot receive requests from anywhere other than the CDN nodes.
  • Positioning: faster content delivery favors better SEO positioning.
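The origin restriction mentioned under Security can be sketched as a simple allowlist check. The IP ranges below are placeholder documentation ranges, not those of any real CDN, and the function name is hypothetical. In practice this rule usually lives in a firewall or in the web server configuration, fed with the ranges published by the CDN provider.

```python
from ipaddress import ip_address, ip_network

# Placeholder ranges (reserved for documentation); replace with the CDN's published ranges.
CDN_RANGES = [
    ip_network("203.0.113.0/24"),
    ip_network("198.51.100.0/24"),
]


def is_from_cdn(client_ip):
    """Return True if the connecting IP belongs to one of the allowed CDN ranges."""
    address = ip_address(client_ip)
    return any(address in network for network in CDN_RANGES)


print(is_from_cdn("203.0.113.10"))  # prints "True": request comes from a CDN node
print(is_from_cdn("192.0.2.55"))    # prints "False": direct request, should be rejected
```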

I hope this article has made it clear what a CDN is and why you should have one. It’s a very big and complex topic, so stay tuned to our blog, as we will continue to publish content about CDNs and much more.

And of course, if you are interested in what a CDN can contribute to your business, do not hesitate to write to us at info@makingscience.com