Graduate Student at USC
Network Protocols: What happens when you type "http://www.linkedin.com"
Mar 29, 2018 / Written by Chengyu
Thanks to powerful search engines such as Google, Bing and Baidu, internet users find it quite convenient to look up information by simply typing a few keywords into the search bar. However, do you know what happens when you type a well-known link, such as “https://www.linkedin.com”? This article explores the “magic” behind the web browser; afterward, you should have a better understanding of how the corresponding content is fetched when you type a URL.
OSI Model and TCP/IP Protocol Layers
The Open Systems Interconnection (OSI) model is a conceptual model that describes how a telecommunication system works without regard to its underlying internal structure and technology. In this article, we explain the process of internet communication layer by layer. To keep things simple, we use the TCP/IP protocol layers, which are easier to understand. Instead of 7 different layers, TCP/IP combines the Application Layer, Presentation Layer and Session Layer into a single Application Layer (as shown in Figure.1), so that there are only 5 layers in the TCP/IP protocol layers.
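To make the mapping concrete, here is a small sketch of how the 7 OSI layers fold into the 5 layers used in this article. The dictionary and function names are made up for illustration only.

```python
# OSI layers (top to bottom) mapped to the five-layer view used in this article.
# Names such as OSI_TO_TCPIP are illustrative, not part of any standard library.
OSI_TO_TCPIP = {
    "Application": "Application",
    "Presentation": "Application",
    "Session": "Application",
    "Transport": "Transport",
    "Network": "Network",
    "Data Link": "Data Link",
    "Physical": "Physical",
}


def tcpip_layers():
    """Return the distinct TCP/IP layers in top-down order."""
    layers = []
    for tcpip_layer in OSI_TO_TCPIP.values():
        if tcpip_layer not in layers:
            layers.append(tcpip_layer)
    return layers


print(len(OSI_TO_TCPIP), "OSI layers ->", len(tcpip_layers()), "TCP/IP layers")
```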
Figure.1 Telecommunication Layers
Processes:
This article separates the process by TCP/IP Protocol Layers.
First Step: Application Layer:
When you type “https://www.linkedin.com” into a browser, we start at the Application Layer. The protocol used in the Application Layer is clearly HTTPS. However, we cannot send the request to the remote server yet, because we have not established a connection to the LinkedIn server at this moment.
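As a rough sketch of what the browser extracts from the typed text, the standard library can split the URL into its parts. The helper name describe_url and the DEFAULT_PORTS table are our own; ports 80 and 443 are the well-known defaults for http and https.

```python
from urllib.parse import urlsplit

# Well-known default ports; used when the URL does not name a port explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url):
    """Split a URL into (scheme, host, port, path)."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port, parts.path or "/"


print(describe_url("https://www.linkedin.com"))
# -> ('https', 'www.linkedin.com', 443, '/')
```

The scheme stays in the Application Layer, while the host and port are what the lower layers need.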
Second Step: move down to Transport Layer:
In order to establish a connection, we move top-down to the Transport Layer. Since the “https” prefix only declares the protocol used in the Application Layer, it is safe to drop it in the lower layers. At this point, we want to make a TCP connection with the LinkedIn server(s). However, a host-to-host connection requires the IP addresses of both sides. We know one peer’s IP address, namely our own, but the addresses of the LinkedIn server(s) are unknown, right?
Third Step: move down to Network Layer:
Thus, DNS (Domain Name System) helps us find the IP address for the domain name. You can treat it as a database whose entries are key-value pairs, in which the key is the domain name and the value is its IP address(es). This kind of pair is different from a HashMap, in which you can only get the value from the key but not the key from the value: DNS also supports reverse lookups from an IP address back to a name. (Strictly speaking, DNS is itself an application-layer protocol; it appears at this step because its answer is the network-layer address we need.) If you have a terminal on your computer, or some command-line interface such as PuTTY, you can type “nslookup www.linkedin.com” to perform a DNS lookup.
Even though looking up the IP address costs some time, and someone may ask why we do not use the IP address directly, DNS is a user-friendly design in which domain names are meaningful strings, like LinkedIn or Google. DNS solves the problem that IP addresses look like arbitrary numbers: it is much easier for users to remember a domain name than an IP address.
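The same lookup can be sketched in Python with the standard socket module, which asks the operating system's resolver. The function names resolve and reverse_lookup are our own, and the addresses returned depend on your network and on when you run it.

```python
import socket


def resolve(hostname, port=443):
    """Forward lookup: return the sorted, de-duplicated IP addresses for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def reverse_lookup(ip_address):
    """Reverse lookup: return the primary host name registered for ip_address."""
    hostname, _aliases, _addresses = socket.gethostbyaddr(ip_address)
    return hostname


# Example (needs network access, so it is left as a comment):
# print(resolve("www.linkedin.com"))
```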
Thanks to some smart browsers, the IP address can often be served from a cache before any request is sent to DNS servers.
It is acknowledged that the more information is stored in a cache, the more exposed that information is. However, caching is quite prevalent in most companies because it reduces data transfer time.
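A cache of this kind can be sketched as a dictionary whose entries expire after a time-to-live. This is a simplified illustration, not how any particular browser implements it; the class name DnsCache and the address 192.0.2.10 (a documentation-only address) are placeholders.

```python
import time


class DnsCache:
    """A toy DNS cache: entries are dropped once their time-to-live has passed."""

    def __init__(self, clock=time.monotonic):
        self._entries = {}   # hostname -> (addresses, expiry time)
        self._clock = clock  # injectable so the expiry logic can be tested

    def put(self, hostname, addresses, ttl_seconds):
        self._entries[hostname] = (list(addresses), self._clock() + ttl_seconds)

    def get(self, hostname):
        """Return cached addresses, or None on a miss or an expired entry."""
        entry = self._entries.get(hostname)
        if entry is None:
            return None
        addresses, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[hostname]
            return None
        return addresses


cache = DnsCache()
cache.put("www.linkedin.com", ["192.0.2.10"], ttl_seconds=60)  # placeholder address
print(cache.get("www.linkedin.com"))
```

A shorter time-to-live limits how long stale or sensitive data stays around, at the cost of more DNS queries.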
Figure.2: Domain Hierarchy
Unfortunately, suppose your browser cannot find the IP address in any cache, so it decides to send a DNS query to the DNS servers. The DNS domain name space is a tree structure, which is illustrated in Figure.2. This tree structure increases DNS lookup efficiency, and the lookup follows the steps shown in Figure.3.
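The walk down the tree can be sketched with toy data: the root points to the server for “com”, which points to the server for “linkedin.com”, which finally knows the address. This is only an illustration of the general idea and may not match every arrow in the figure; all server names and the address 192.0.2.10 are placeholders.

```python
# Toy name-server data; every server name and address here is a placeholder.
ROOT = {"com": "tld-server-for-com"}
TLD_SERVERS = {"tld-server-for-com": {"linkedin.com": "ns-for-linkedin"}}
AUTHORITATIVE = {"ns-for-linkedin": {"www.linkedin.com": "192.0.2.10"}}


def iterative_lookup(hostname):
    """Walk root -> TLD server -> authoritative server and record each referral."""
    labels = hostname.split(".")
    tld = labels[-1]              # e.g. "com"
    zone = ".".join(labels[-2:])  # e.g. "linkedin.com"

    steps = []
    tld_server = ROOT[tld]
    steps.append(("root", tld_server))

    auth_server = TLD_SERVERS[tld_server][zone]
    steps.append((tld_server, auth_server))

    address = AUTHORITATIVE[auth_server][hostname]
    steps.append((auth_server, address))
    return address, steps


address, steps = iterative_lookup("www.linkedin.com")
for asked, answer in steps:
    print(f"asked {asked}, got {answer}")
```

Each server only needs to know about the level directly below it, which is why the tree keeps lookups short.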
Figure.3: DNS Request Process
Fourth Step: move up to Transport Layer:
Now we have the IP addresses of both peers, so we can establish a connection between them! The TCP three-way handshake is the key process for TCP connection establishment: a three-step process in which the client and the server exchange messages with TCP headers (only the SYN and ACK flags are used in this step).
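The three messages can be simulated without touching the network. This is a simplified model that tracks only the flags and the sequence and acknowledgment numbers; the Segment class and the initial sequence numbers are made up for the example.

```python
from dataclasses import dataclass

SEQ_MODULO = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    flags: frozenset
    seq: int
    ack: int = 0


def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYN+ACK and ACK segments of a connection setup."""
    # 1. Client -> server: SYN carrying the client's initial sequence number.
    syn = Segment(frozenset({"SYN"}), seq=client_isn)
    # 2. Server -> client: SYN+ACK, acknowledging the client's SYN.
    syn_ack = Segment(
        frozenset({"SYN", "ACK"}),
        seq=server_isn,
        ack=(syn.seq + 1) % SEQ_MODULO,
    )
    # 3. Client -> server: ACK, acknowledging the server's SYN.
    ack = Segment(
        frozenset({"ACK"}),
        seq=(syn.seq + 1) % SEQ_MODULO,
        ack=(syn_ack.seq + 1) % SEQ_MODULO,
    )
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(sorted(segment.flags), "seq =", segment.seq, "ack =", segment.ack)
```

In real code the operating system performs this exchange for you when a socket connects.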
Fifth Step: move up to Application Layer:
With the TCP connection in place, you can now transfer data with the LinkedIn server! Because the URL uses https, the browser first performs a TLS handshake over this connection so that the traffic is encrypted. A GET request is then used over HTTP/S; you can see the detail in this.
Congrats! I believe you now have a better understanding of the mechanism behind the browser when you type “https://www.linkedin.com”!
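For a feel of what the GET request looks like on the wire, here is a sketch that builds a minimal one by hand. It assumes HTTP/1.1 and only two headers; real browsers send many more and may use a newer HTTP version. The function name build_get_request is our own.

```python
def build_get_request(host, path="/"):
    """Build the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # mandatory header in HTTP/1.1
        "Connection: close",     # ask the server to close the connection afterwards
        "",                      # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_get_request("www.linkedin.com").decode("ascii"))
```

For an https URL these bytes would travel inside the encrypted connection (for example a socket wrapped with Python's ssl module) rather than as plain text.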
Add on:
What we have discussed above is based on an assumption: you want to connect to the LinkedIn server and transfer data with it via the internet. However, if your IP address is in the same local network as the LinkedIn server, the traffic does not need to be routed across the internet. Instead, you will use the Data Link Layer, in which ARP helps you get the MAC address of the LinkedIn server(s), and you can communicate with it directly via the MAC address.
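Whether two hosts share a local network can be checked from their IP addresses and the network prefix length. The sketch below covers only that decision, not ARP itself; the function name and the private addresses are placeholders.

```python
import ipaddress


def on_same_network(my_ip, other_ip, prefix_length):
    """Return True if other_ip falls inside the network that my_ip belongs to."""
    # strict=False lets us pass a host address and still get its enclosing network.
    network = ipaddress.ip_network(f"{my_ip}/{prefix_length}", strict=False)
    return ipaddress.ip_address(other_ip) in network


print(on_same_network("192.168.1.10", "192.168.1.20", 24))  # same /24: deliver directly
print(on_same_network("192.168.1.10", "10.0.0.5", 24))      # different network: go via a router
```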