What happens when you type google.com in your browser and press Enter?
It might seem like a simple action. Behind the scenes, however, a complex series of steps occurs. We will take a deeper look into the journey of a URL and explore what happens when you initiate this process.
DNS Request
The domain name system (DNS) is a naming database in which Internet domain names are located and translated into Internet Protocol (IP) addresses. The domain name system maps the name people use to locate a website to the IP address that a computer uses to locate that website.
For example, if someone types "google.com" into a web browser, a server behind the scenes maps that name to the corresponding IP address, such as 172.217.170.206.
How DNS works
DNS servers convert domain names (the host part of a URL) into IP addresses that computers can understand and use. They translate what a user types into a browser into something the machine can use to find a webpage. This process of translation and lookup is called DNS resolution.
The basic process of a DNS resolution follows these steps:
The user enters a web address or domain name into a browser.
The browser sends a message, called a recursive DNS query, to the network to find out which IP or network address the domain corresponds to.
The query goes to a recursive DNS server, which is also called a recursive resolver, and is usually managed by the internet service provider (ISP). If the recursive resolver has the address, it will return the address to the user, and the webpage will load.
If the recursive DNS server does not have an answer, it will query a series of other servers in the following order: DNS root name servers, top-level domain (TLD) name servers and authoritative name servers.
The three server types work together and continue redirecting until they retrieve a DNS record that contains the queried IP address. The authoritative name server sends this information to the recursive DNS server, and the webpage the user is looking for loads. DNS root name servers and TLD servers primarily redirect queries and rarely provide the resolution themselves.
The recursive server stores, or caches, the A record for the domain name, which contains the IP address. The next time it receives a request for that domain name, it can respond directly to the user instead of querying other servers.
If the query reaches the authoritative server and it cannot find the information, it returns an error message.
The entire process of querying the various servers takes a fraction of a second and is usually imperceptible to the user.
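The resolution steps above can be sketched as a small simulation. This is a toy model, not a real DNS client: the three dictionaries standing in for the root, TLD, and authoritative servers, the server names such as `tld-com` and `ns-google`, and the `RecursiveResolver` class are all hypothetical and exist only for illustration.

```python
from typing import Dict, Optional

# Hypothetical, hard-coded data standing in for the real DNS hierarchy.
ROOT_SERVERS: Dict[str, str] = {"com": "tld-com"}  # TLD -> TLD server
TLD_SERVERS: Dict[str, Dict[str, str]] = {
    "tld-com": {"google.com": "ns-google"},        # domain -> authoritative server
}
AUTHORITATIVE: Dict[str, Dict[str, str]] = {
    "ns-google": {"google.com": "172.217.170.206"},  # A records
}


class RecursiveResolver:
    """Toy recursive resolver that caches the A records it has looked up."""

    def __init__(self) -> None:
        self.cache: Dict[str, str] = {}
        self.upstream_queries = 0  # how many times the hierarchy was walked

    def resolve(self, domain: str) -> Optional[str]:
        # Answer from the cache when the record is already known.
        if domain in self.cache:
            return self.cache[domain]
        self.upstream_queries += 1
        # Ask the root which TLD server handles this suffix.
        tld = domain.rsplit(".", 1)[-1]
        tld_server = ROOT_SERVERS.get(tld)
        if tld_server is None:
            return None
        # Ask the TLD server which authoritative server knows the domain.
        auth_server = TLD_SERVERS[tld_server].get(domain)
        if auth_server is None:
            return None
        # Ask the authoritative server for the A record and cache it.
        ip = AUTHORITATIVE[auth_server].get(domain)
        if ip is not None:
            self.cache[domain] = ip
        return ip


resolver = RecursiveResolver()
first = resolver.resolve("google.com")   # walks root -> TLD -> authoritative
second = resolver.resolve("google.com")  # served from the cache
```

The second lookup never leaves the resolver, which mirrors the caching step described above. A real resolver would also honor record expiry times, which this sketch ignores.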
TCP/IP
Using the IP address, the browser establishes a connection to Google's servers using the Transmission Control Protocol (TCP) and the Internet Protocol (IP). TCP ensures the data is transmitted reliably between your browser and Google's servers.
What Does TCP/IP Do?
The main job of TCP/IP is to transfer data from one device to another. The key requirement is that the data arrives reliably and accurately, so that the receiver gets exactly the information the sender sent. To ensure that each message reaches its final destination intact, the TCP/IP model divides the data into packets and recombines them at the other end, which helps maintain the accuracy of the data in transit.
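The splitting and recombining described above can be illustrated with a minimal sketch. It only models numbering segments and putting them back in order. Acknowledgements, retransmission, and checksums, which real TCP also relies on, are left out, and the function names are made up for this example.

```python
import random
from typing import List, Tuple

Segment = Tuple[int, bytes]  # (byte offset used as sequence number, payload)


def split_into_segments(data: bytes, size: int) -> List[Segment]:
    """Cut the data into chunks, tagging each with its starting offset."""
    return [(offset, data[offset:offset + size])
            for offset in range(0, len(data), size)]


def reassemble(segments: List[Segment]) -> bytes:
    """Sort the segments by sequence number and join the payloads."""
    return b"".join(payload for _, payload in sorted(segments))


message = b"GET / HTTP/1.1\r\nHost: google.com\r\n\r\n"
segments = split_into_segments(message, size=8)
random.shuffle(segments)  # simulate packets arriving out of order
assert reassemble(segments) == message
```

Because every segment carries its position in the original stream, the receiver can rebuild the message no matter what order the network delivers the pieces in.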
What is the difference between TCP and IP?
TCP and IP are separate protocols that work together. The basic difference between TCP (Transmission Control Protocol) and IP (Internet Protocol) is the role each plays in transmitting data. In simple words, IP handles addressing and routing, getting each packet to the destination address, while TCP manages the connection and makes sure the data is sent and received completely and in order.
Firewall
As part of the process, your data passes through your computer's firewall and any network firewall in place. Firewalls are like gatekeepers that ensure security by filtering incoming and outgoing traffic.
What Is a Firewall?
A firewall is a network security device that observes and filters incoming and outgoing network traffic, adhering to the security policies defined by an organization. Essentially, it acts as a protective wall between a private internal network and the public Internet.
Firewalls are used in enterprise and personal settings. They are a vital component of network security. Most operating systems have a basic built-in firewall. However, a dedicated third-party firewall application may provide additional protection.
Types of Firewalls
A firewall can be either software or hardware. Software firewalls are programs installed on each computer, and they regulate network traffic through applications and port numbers. Hardware firewalls, meanwhile, are devices installed between the gateway and your network. Additionally, a firewall delivered as a cloud service is called a cloud firewall.
There are multiple types of firewalls based on their traffic filtering methods, structure, and functionality. A few of the types of firewalls are:
- Packet Filtering
A packet-filtering firewall controls data flow to and from a network. It allows or blocks the data transfer based on the packet's source address, its destination address, the application protocol used to transfer the data, and so on.
- Proxy Service Firewall
This type of firewall protects the network by filtering messages at the application layer. For a specific application, a proxy firewall serves as the gateway from one network to another.
- Stateful Inspection
Such a firewall permits or blocks network traffic based on state, port, and protocol. Here, it decides filtering based on administrator-defined rules and context.
- Next-Generation Firewall
According to Gartner Inc.’s definition, the next-generation firewall is a deep-packet inspection firewall that adds application-level inspection, intrusion prevention, and information from outside the firewall to go beyond port/protocol inspection and blocking.
- Unified Threat Management (UTM) Firewall
A UTM device generally integrates the capabilities of a stateful inspection firewall, intrusion prevention, and antivirus in a loosely linked manner. It may include additional services and, in many cases, cloud management. UTMs are designed to be simple and easy to use.
- Threat-Focused NGFW
These firewalls provide advanced threat detection and mitigation. With network and endpoint event correlation, they may detect evasive or suspicious behavior.
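To make the difference between packet filtering and stateful inspection more concrete, here is a minimal sketch of connection tracking. It assumes a very simplified policy (all outbound traffic is allowed, and inbound traffic is allowed only as a reply to a tracked connection). The `StatefulFirewall` class and its method names are hypothetical.

```python
from typing import Set, Tuple

# (local_ip, local_port, remote_ip, remote_port)
Connection = Tuple[str, int, str, int]


class StatefulFirewall:
    """Toy stateful firewall that remembers outbound connections."""

    def __init__(self) -> None:
        self.established: Set[Connection] = set()

    def outbound(self, src_ip: str, src_port: int,
                 dst_ip: str, dst_port: int) -> bool:
        # Assumed policy: outbound traffic is allowed and remembered.
        self.established.add((src_ip, src_port, dst_ip, dst_port))
        return True

    def inbound(self, src_ip: str, src_port: int,
                dst_ip: str, dst_port: int) -> bool:
        # Inbound traffic passes only if it answers a tracked connection.
        return (dst_ip, dst_port, src_ip, src_port) in self.established


fw = StatefulFirewall()
fw.outbound("192.168.1.10", 50000, "172.217.170.206", 443)
reply_allowed = fw.inbound("172.217.170.206", 443, "192.168.1.10", 50000)
stranger_allowed = fw.inbound("203.0.113.9", 443, "192.168.1.10", 50000)
```

A pure packet filter would judge each packet in isolation. The stateful version uses the context of earlier packets, so an unsolicited packet is rejected even when its port and protocol look legitimate.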
How Does a Firewall Work?
A firewall analyses which traffic should be allowed or restricted based on a set of rules. Think of the firewall as a gatekeeper at your computer’s entry point which only allows trusted sources, or IP addresses, to enter your network.
A firewall accepts only the incoming traffic that it has been configured to accept. It distinguishes between good and malicious traffic and either allows or blocks specific data packets based on pre-established security rules.
These rules are based on several aspects indicated by the packet data, like their source, destination, content, and so on. They block traffic coming from suspicious sources to prevent cyberattacks.
For example, when traffic comes from a trusted source, the firewall allows it to pass through to the user's private network.
When traffic comes from a suspicious source, however, the firewall blocks it from entering the private network, thereby protecting the user's network from a cyberattack.
This way, a firewall carries out quick assessments to detect malware and other suspicious activities.
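The rule-based filtering described in this section can be sketched as a first-match rule list. This is an illustrative model under stated assumptions, not how any particular firewall product is configured: the `Packet` and `Rule` classes, the `decide` function, and the sample rules are hypothetical, and the "suspicious" address range is a reserved documentation range.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass
class Packet:
    src_ip: str
    dst_port: int
    protocol: str  # "tcp" or "udp"


@dataclass
class Rule:
    action: str               # "allow" or "block"
    src_network: str          # source range in CIDR notation
    dst_port: Optional[int]   # None matches any port
    protocol: Optional[str]   # None matches any protocol

    def matches(self, packet: Packet) -> bool:
        return (
            ip_address(packet.src_ip) in ip_network(self.src_network)
            and self.dst_port in (None, packet.dst_port)
            and self.protocol in (None, packet.protocol)
        )


def decide(packet: Packet, rules: List[Rule], default: str = "block") -> str:
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(packet):
            return rule.action
    return default


RULES = [
    Rule("block", "203.0.113.0/24", None, None),  # assumed suspicious source range
    Rule("allow", "0.0.0.0/0", 443, "tcp"),       # HTTPS from anywhere else
]
```

With these rules, `decide(Packet("198.51.100.4", 443, "tcp"), RULES)` yields `"allow"`, while any packet from the blocked range, or to a port no rule mentions, is blocked. Defaulting to "block" reflects the idea that only explicitly accepted traffic gets through.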
HTTPS/SSL
Like many other websites, Google uses HTTPS (Hypertext Transfer Protocol Secure). This is where SSL (Secure Sockets Layer) or TLS (Transport Layer Security) encryption comes into play. The data transmitted between your browser and Google's servers is encrypted to ensure privacy and security.
HTTPS is the secure version of HTTP, which is the primary protocol used to send data between a web browser and a website. HTTPS increases the security of data transfer. This is particularly important when users transmit sensitive data, such as by logging into a bank account, email service, or health insurance provider.
Any website, especially those that require login credentials, should use HTTPS. In modern web browsers such as Chrome, websites that do not use HTTPS are marked differently than those that do. Look for a padlock in the URL bar to signify the webpage is secure. Web browsers take HTTPS seriously; Google Chrome and other browsers flag all non-HTTPS websites as not secure.
How does HTTPS work?
HTTPS uses an encryption protocol to encrypt communications. The protocol is called Transport Layer Security (TLS), although formerly it was known as Secure Sockets Layer (SSL). This protocol secures communications by using what’s known as an asymmetric public key infrastructure. This type of security system uses two different keys to encrypt communications between two parties:
The private key - this key is controlled by the owner of a website and it’s kept, as the reader may have speculated, private. This key lives on a web server and is used to decrypt information encrypted by the public key.
The public key - this key is available to everyone who wants to interact with the server securely. Information that’s encrypted by the public key can only be decrypted by the private key.
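The relationship between the two keys can be shown with a toy version of textbook RSA. This sketch uses tiny primes and no padding, so it is for illustration only and offers no real security. In practice, TLS uses the key pair mainly to authenticate the server and agree on shared session keys rather than to encrypt all traffic this way. The variable and function names are chosen for this example.

```python
# Toy textbook RSA with tiny primes. Illustration only: not secure.
p, q = 61, 53
n = p * q                  # modulus, shared by both keys
phi = (p - 1) * (q - 1)    # used only to derive the private exponent
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)

PUBLIC_KEY = (e, n)        # available to everyone
PRIVATE_KEY = (d, n)       # kept on the web server


def encrypt(message: int, public_key=PUBLIC_KEY) -> int:
    exponent, modulus = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, private_key=PRIVATE_KEY) -> int:
    exponent, modulus = private_key
    return pow(ciphertext, exponent, modulus)


secret = 65                       # a message encoded as a number below n
ciphertext = encrypt(secret)      # anyone can do this with the public key
assert decrypt(ciphertext) == secret  # only the private key recovers it
```

Anyone holding the public key can produce the ciphertext, but turning it back into the original number requires the private exponent, which never leaves the server.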
Why is HTTPS important? What happens if a website doesn’t have HTTPS?
HTTPS prevents websites from having their information broadcast in a way that’s easily viewed by anyone snooping on the network. When information is sent over regular HTTP, the information is broken into packets of data that can be easily “sniffed” using free software. This makes communication over an insecure medium, such as public Wi-Fi, highly vulnerable to interception. All communications that occur over HTTP occur in plain text, making them highly accessible to anyone with the correct tools, and vulnerable to on-path attacks.
With HTTPS, traffic is encrypted such that even if the packets are sniffed or otherwise intercepted, they will come across as nonsensical characters. Let’s look at an example:
Before encryption:
This is a string of text that is completely readable
After encryption:
ITM0IRyiEhVpa6VnKyExMiEgNveroyWBPlgGyfkflYjDaaFf/Kn3bo3OfghBPDWo6AfSHlNtL8N7ITEwIXc1gU5X73xMsJormzzXlwOyrCs+9XCPk63Y+z0=
In websites without HTTPS, Internet service providers (ISPs) or other intermediaries can inject content into web pages without the approval of the website owner. This commonly takes the form of advertising, where an ISP looking to increase revenue injects paid advertising into the web pages of their customers. Unsurprisingly, when this occurs, the profits for the advertisements and the quality control of those advertisements are in no way shared with the website owner. HTTPS eliminates the ability of unmoderated third parties to inject advertising into web content.
Load-Balancer
At any given moment, Google's servers could be receiving millions of requests. To ensure that no single server becomes overwhelmed and that your request is handled correctly, Google uses load balancers. These devices or software applications distribute user requests efficiently among multiple web servers.
Load balancing refers to efficiently distributing incoming network traffic across a group of backend servers, also known as a server farm or server pool.
Modern high‑traffic websites must serve hundreds of thousands, if not millions, of concurrent requests from users or clients and return the correct text, images, video, or application data, all in a fast and reliable manner. To cost‑effectively scale to meet these high volumes, modern computing best practice generally requires adding more servers.
A load balancer acts as the “traffic cop” sitting in front of your servers and routing client requests across all servers capable of fulfilling those requests in a manner that maximizes speed and capacity utilization and ensures that no one server is overworked, which could degrade performance. If a single server goes down, the load balancer redirects traffic to the remaining online servers. When a new server is added to the server group, the load balancer automatically starts to send requests to it. In this manner, a load balancer performs the following functions:
Distributes client requests or network load efficiently across multiple servers
Ensures high availability and reliability by sending requests only to servers that are online
Provides the flexibility to add or subtract servers as demand dictates
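The three functions listed above can be sketched in a few lines. This is a simplified model with a hypothetical `ServerPool` class. A real load balancer would discover server health through periodic health checks, whereas here the status is set by hand.

```python
from typing import Dict


class ServerPool:
    """Toy load balancer that only routes to servers marked online."""

    def __init__(self) -> None:
        self.online: Dict[str, bool] = {}
        self._counter = 0

    def add(self, name: str) -> None:
        self.online[name] = True        # new servers start receiving traffic

    def remove(self, name: str) -> None:
        self.online.pop(name, None)

    def mark(self, name: str, is_up: bool) -> None:
        self.online[name] = is_up       # stand-in for a health check result

    def route(self) -> str:
        candidates = [name for name, up in self.online.items() if up]
        if not candidates:
            raise RuntimeError("no servers online")
        choice = candidates[self._counter % len(candidates)]
        self._counter += 1
        return choice


pool = ServerPool()
pool.add("web-1")
pool.add("web-2")
pool.mark("web-1", False)                        # web-1 goes down
during_outage = {pool.route() for _ in range(4)}  # only web-2 is used
pool.add("web-3")                                # capacity added on demand
after_scaling = {pool.route() for _ in range(4)}  # web-2 and web-3 share load
```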
Load Balancing Algorithms
Different load balancing algorithms provide different benefits; the choice of load balancing method depends on your needs:
Round Robin – Requests are distributed across the group of servers sequentially.
Least Connections – A new request is sent to the server with the fewest current connections to clients. The relative computing capacity of each server is factored into determining which one has the least connections.
Least Time – Sends requests to the server selected by a formula that combines the fastest response time and fewest active connections.
Hash – Distributes requests based on a key you define, such as the client IP address or the request URL.
IP Hash – The IP address of the client is used to determine which server receives the request.
Random with Two Choices – Picks two servers at random and sends the request to the one that is selected by then applying the Least Connections algorithm.
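Several of these algorithms are short enough to sketch directly. The functions below are simplified illustrations with hypothetical names and server labels. In particular, this version of Least Connections ignores the relative capacity of each server, and Least Time is omitted because it needs response-time measurements.

```python
import hashlib
import itertools
import random
from typing import Dict, Iterator, List

SERVERS = ["web-1", "web-2", "web-3"]  # hypothetical server names


def round_robin(servers: List[str]) -> Iterator[str]:
    """Yield servers sequentially, wrapping around at the end."""
    return itertools.cycle(servers)


def least_connections(active: Dict[str, int]) -> str:
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)


def ip_hash(client_ip: str, servers: List[str]) -> str:
    """Map a client IP to a server so the same client lands on the same one."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def random_two_choices(active: Dict[str, int], rng: random.Random) -> str:
    """Sample two servers at random, then keep the less loaded of the pair."""
    first, second = rng.sample(list(active), 2)
    return first if active[first] <= active[second] else second


rotation = round_robin(SERVERS)
first_four = [next(rotation) for _ in range(4)]
```

Here `first_four` wraps back to `web-1` on the fourth request, and `ip_hash` returns the same server every time it is given the same client address, which is what makes it useful for keeping a client on one backend.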
Benefits of Load Balancing
Reduced downtime
Scalable
Redundancy
Flexibility
Efficiency
Web Server
Once your request reaches a web server at Google, the server retrieves the necessary resources. A web server is software and hardware that uses HTTP (Hypertext Transfer Protocol) and other protocols to respond to client requests made over the World Wide Web. The main job of a web server is to display website content by storing, processing and delivering webpages to users. Besides HTTP, some web servers also support SMTP (Simple Mail Transfer Protocol) and FTP (File Transfer Protocol), used for email, file transfer and storage.
Web server hardware is connected to the internet and allows data to be exchanged with other connected devices, while web server software controls how a user accesses hosted files. The web server process is an example of the client/server model. All computers that host websites must have web server software.
Web servers are used in web hosting, or the hosting of data for websites and web-based applications -- or web applications.
How do web servers work?
Web server software is accessed through the domain names of websites and ensures the delivery of the site's content to the requesting user. The software side comprises several components, including at least an HTTP server. The HTTP server can understand HTTP and URLs. As hardware, a web server is a computer that stores web server software and other files related to a website, such as HTML documents, images and JavaScript files.
When a web browser, like Google Chrome or Firefox, needs a file that's hosted on a web server, the browser will request the file by HTTP. When the request is received by the web server, the HTTP server will accept the request, find the content and send it back to the browser through HTTP.
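The accept, find, and send-back cycle can be sketched as a single function that turns a raw HTTP request into a raw HTTP response. This is a minimal illustration without sockets: the in-memory `FILES` dictionary stands in for files on disk, and the function names are hypothetical.

```python
from typing import Dict

# Hypothetical in-memory "document root" standing in for files on disk.
FILES: Dict[str, bytes] = {
    "/index.html": b"<html><body>Hello</body></html>",
}


def build_response(status: int, reason: str, body: bytes) -> bytes:
    """Assemble a minimal HTTP/1.1 response."""
    head = (
        f"HTTP/1.1 {status} {reason}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def handle_request(raw_request: bytes) -> bytes:
    """Parse the request line, look up the file, and build the response."""
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = request_line.split(" ")
    if len(parts) != 3:
        return build_response(400, "Bad Request", b"")
    method, path, _version = parts
    if method != "GET":
        return build_response(405, "Method Not Allowed", b"")
    if path == "/":
        path = "/index.html"   # serve the default document
    body = FILES.get(path)
    if body is None:
        return build_response(404, "Not Found", b"")
    return build_response(200, "OK", body)


response = handle_request(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
```

A production web server adds much more (connection handling, content types, caching, TLS), but the core exchange is the same: read the request, locate the content, and send it back over HTTP.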
Application Server
Applications come in all shapes, sizes, and use cases. In a world where we rely on a host of critical business processes, application servers are the high-powered computers providing application resources to users and web clients.
Application servers physically or virtually sit between database servers storing application data and web servers communicating with clients. App servers and akin middleware are the operating systems supporting an application’s development and delivery. Whether it’s a desktop, mobile, or web app, application servers play a critical role in connecting a world of devices. When application users, be they staff or web clients, request access to an application, the application server often does the heavy lifting on the backend to store and process dynamic application requests.
Why Do We Need Application Servers?
Billions of web clients make HTTP requests every day, expecting instant access to you-name-the-app. Whether it is Headspace during the morning routine, Google Docs for the extensive report, or Twitter during a coffee break, the application in use is being pulled from an application server and delivered via a web server.
Web servers are responsible for answering web clients' HTTP requests with HTTP responses. Unlike app servers, the web server design is light enough to process static data requests for multiple applications (or websites) while maintaining security. Dynamic requests, often in the form of applications, require additional assistance.
Application servers optimize traffic and add security
A web server cannot stay agile if it has to both manage HTTP requests from web clients and pass or store resources for multiple websites. Application servers fill this gap with a high-powered design built for handling dynamic web content requests.
Application servers also provide program redundancy and an added layer of security. Once deployed between a database and a web server, the job of preserving and duplicating application architecture across the network is more feasible. The additional step between potential malicious web communications and the crown jewels in the database server adds an extra security layer. Because application servers can process business logic requests, an attempted SQL injection is also that much harder.
Organizations can further protect their data with a reverse proxy server positioned in front of their databases. Proxy servers and VPNs can do wonders for anonymizing and encrypting communication to protect users and company data.
How Do Application Servers Work?
Like most servers today, application servers contain features for security, transactions, services, clustering, diagnostics, and databases. Where application servers deviate is their ability to process servlet requests from a web server.
The general flow for web application servers is as follows:
The client opens a browser and requests access to a website
The web server receives the HTTP request and responds with the desired webpage
The web server handles static data requests, but the client wants to use an interactive tool
As a dynamic data request, the web server transfers the request to an application server
The application server receives the HTTP request and converts it into a servlet request
The servlet reaches the database server, and the app server receives a servlet response
The app server translates the servlet response into HTTP format for client access
Upon receiving a servlet request from a web server, the application server processes the request and responds to the web server via a servlet response. Because application servers primarily work with business logic requests, the web server translates the servlet response and passes an HTTP response accessible to the user.
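The division of labor between the three tiers can be sketched with plain functions. This is a loose Python analogy rather than a servlet implementation: the route `/balance/<user>`, the sample data, and all function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

STATIC_PAGES: Dict[str, str] = {"/": "<h1>Welcome</h1>"}  # web server content
DATABASE: Dict[str, float] = {"alice": 120.0}             # database server stand-in


def database_lookup(user: str) -> Optional[float]:
    """Database tier: return the stored value, if any."""
    return DATABASE.get(user)


def app_server(path: str) -> Tuple[int, str]:
    """Application tier: run business logic for dynamic requests."""
    prefix = "/balance/"
    if not path.startswith(prefix):
        return 404, "unknown route"
    user = path[len(prefix):]
    balance = database_lookup(user)
    if balance is None:
        return 404, "unknown user"
    return 200, f"{user} has balance {balance:.2f}"


def web_server(path: str) -> Tuple[int, str]:
    """Web tier: serve static pages itself, forward everything else."""
    if path in STATIC_PAGES:
        return 200, STATIC_PAGES[path]
    return app_server(path)


static_result = web_server("/")
dynamic_result = web_server("/balance/alice")
```

The client only ever talks to `web_server`. The database is reachable only through `app_server`, which is the extra layer between outside traffic and stored data that the section above describes.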
Database
A database is an organized collection of structured information, or data, typically stored electronically in a computer system. A database is usually controlled by a database management system (DBMS). Together, the data and the DBMS, along with the applications that are associated with them, are referred to as a database system, often shortened to just database.
Data within the most common types of databases in operation today is typically modeled in rows and columns in a series of tables to make processing and data querying efficient. The data can then be easily accessed, managed, modified, updated, controlled, and organized. Most databases use structured query language (SQL) for writing and querying data.
What is Structured Query Language (SQL)?
SQL is a programming language used by nearly all relational databases to query, manipulate, and define data, and to provide access control. SQL was first developed at IBM in the 1970s with Oracle as a major contributor, which led to the implementation of the SQL ANSI standard. Since then, SQL has spurred many extensions from companies such as IBM, Oracle, and Microsoft. Although SQL is still widely used today, new programming languages are beginning to appear.
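To show what defining, manipulating, and querying data looks like in practice, here is a short sketch using SQLite through Python's built-in `sqlite3` module. The `users` table and its rows are invented for this example, and an in-memory database is used so nothing is written to disk.

```python
import sqlite3

# In-memory database; the table and rows are made up for illustration.
conn = sqlite3.connect(":memory:")

# Define data: create a table with rows and columns.
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)

# Manipulate data: insert and update rows using parameter placeholders.
conn.executemany(
    "INSERT INTO users (name, city) VALUES (?, ?)",
    [("Ada", "London"), ("Linus", "Helsinki")],
)
conn.execute("UPDATE users SET city = ? WHERE name = ?", ("Portland", "Linus"))

# Query data: read the rows back in a defined order.
rows = conn.execute("SELECT name, city FROM users ORDER BY name").fetchall()
conn.close()
```

The `?` placeholders pass values separately from the SQL text, which is a common way to reduce the risk of SQL injection.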
What is a database management system (DBMS)?
A database typically requires a comprehensive database software program known as a database management system (DBMS). A DBMS serves as an interface between the database and its end users or programs, allowing users to retrieve, update, and manage how the information is organized and optimized. A DBMS also facilitates oversight and control of databases, enabling a variety of administrative operations such as performance monitoring, tuning, and backup and recovery.
Some examples of popular database software or DBMSs include MySQL, Microsoft Access, Microsoft SQL Server, FileMaker Pro, Oracle Database, and dBASE.