In the previous article we discussed how a user sends a request to a server by using URLs in a browser. Now we will explore what happens to that request once it leaves the user’s computer, how it travels through the Internet, and how it is protected.
Together with advances in transportation, the Internet made the world a much smaller place. As long as there is an Internet connection, your content or product can be accessed and purchased by anyone, anywhere.
Technically, the Internet is a network of networks that communicate using TCP/IP. Networks within one country are connected to networks in other countries largely through submarine cables. These cables, laid along the ocean floor, carry many terabits of data per second.
Internet data can also be transmitted through satellites orbiting the Earth. Many of these satellites are in geostationary orbit, meaning they stay in the same position relative to the Earth’s surface as it rotates. Requests are then routed through ground stations that are connected to the Internet.
One advantage of satellite technology is that it allows for greater portability and mobility. However, it suffers from higher latency and more interference than traditional cables. Satellite connections also generally have lower throughput (on the order of hundreds of megabits per second for a typical connection) than submarine cables.
Requests that browsers make often travel directly from the source location towards the server via these cables or satellites. For additional security, privacy, or convenience, requests can instead be routed through an intermediate network before they reach the server.
A Virtual Private Network (VPN) extends a private network across another network, usually the public Internet, by sending the data through an encrypted tunnel. This enables users to access network resources that normally can’t be reached from the Internet. Let’s say you are working remotely and need to connect to your company’s network to fetch some documentation. Usually you cannot even load your company’s internal website unless you turn on your company-provided VPN.
The term “VPN” is also commonly used for services that mask your actual IP address. Consumers use these for streaming and accessing region-locked content. While these services use a VPN to achieve the desired result, it is technically more correct to call them VPN services.
There is another special way to connect to the Internet called The Onion Router (Tor). The Tor network is an overlay network of volunteer-run nodes. Your request is wrapped in several layers of encryption and relayed through multiple nodes (scattered around the globe) before it reaches the server. This makes it difficult for someone to determine where the request comes from and thus can greatly increase your privacy while browsing the Internet.
Tor also allows you to access a special TLD called .onion. URLs using this TLD cannot be accessed via a normal Internet connection. While Tor increases privacy protection, one drawback is its speed. Because the request is passed through multiple nodes, browsing is noticeably slower than over a normal connection.
Browsers send and receive data using HTTP. By default, this data is passed through the network in plain text, meaning that someone who can intercept the requests can see the data being sent. This is problematic especially when security is required, such as when entering passwords and processing financial and other confidential information.
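To see what “plain text” means in practice, here is a small sketch that builds the bytes a browser would put on the network for a login form sent over plain HTTP. The host `example.com`, the `/login` path, and the form fields are made-up values for illustration only.

```python
from urllib.parse import urlencode

# Hypothetical form fields; names and values are invented for this example.
form = urlencode({"username": "alice", "password": "hunter2"})

# A minimal HTTP/1.1 request as it would be written to the network socket.
request = (
    "POST /login HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(form)}\r\n"
    "\r\n"
    f"{form}"
)
wire_bytes = request.encode("ascii")

# Anyone who captures these bytes can read the password directly.
password_visible = b"password=hunter2" in wire_bytes
print(wire_bytes.decode("ascii"))
print("password readable on the wire:", password_visible)
```

Nothing in the request hides the form body, which is why an eavesdropper on the same network path can read it as easily as the server can.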
The solution is to encrypt these requests using Transport Layer Security (TLS), which results in what is called HTTPS (secure HTTP). The main reasons why we use HTTPS are:
- Privacy – our data travels through the Internet without any malicious agent being able to read it
- Integrity – the data cannot be tampered with or changed in transit without detection
- Identification – we can be certain that the servers we are communicating with are who they say they are
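As a rough illustration of how a client asks for these guarantees, the sketch below uses Python’s standard `ssl` module. The default context already requires a valid certificate and checks that it matches the host name, which corresponds to the identification point above. The helper name `describe_connection` is a hypothetical name chosen for this example, and it needs a working Internet connection, so it is only defined here, not called.

```python
import socket
import ssl

# The default client context verifies the server certificate against the
# system's trusted certificate authorities and checks the host name.
context = ssl.create_default_context()
print("certificate required:", context.verify_mode == ssl.CERT_REQUIRED)
print("host name checked:", context.check_hostname)

def describe_connection(host: str, port: int = 443):
    """Open a TLS connection and report the protocol version and cipher name."""
    with socket.create_connection((host, port), timeout=5) as sock:
        # wrap_socket performs the TLS handshake before returning.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

Calling `describe_connection("example.com")` on a machine with network access would return the negotiated TLS version and cipher suite name; if the certificate cannot be verified, the handshake raises an `ssl.SSLCertVerificationError` instead.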
TLS or SSL?
Let’s backtrack a bit and talk about TLS and SSL. In the early days of the Internet, requests were encrypted using the Secure Sockets Layer (SSL) protocol. SSL 2.0 and SSL 3.0 were released to the public but were eventually deprecated due to their vulnerabilities. The last version, SSL 3.0, was deprecated in 2015.
When the next version was released, it was not called SSL 4.0, but was instead renamed to Transport Layer Security (TLS) 1.0. As of the time of this writing, TLS 1.0 and 1.1 are deprecated, and TLS 1.2 and 1.3 are the versions in current use.
Thus, TLS is the correct technical term for the protocol used in HTTPS, not SSL. For the same reason, we refer to what are often called SSL Certificates as TLS Certificates.
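A client or server can enforce this version policy explicitly. The following is a minimal sketch using Python’s standard `ssl` module (Python 3.7 or later); the variable name `strict_context` is just an illustrative choice.

```python
import ssl

# Start from the secure defaults, then refuse anything older than TLS 1.2.
strict_context = ssl.create_default_context()
strict_context.minimum_version = ssl.TLSVersion.TLSv1_2

print("oldest protocol accepted:", strict_context.minimum_version.name)
```

With this setting, a handshake with a server that only offers SSL 3.0, TLS 1.0, or TLS 1.1 fails instead of silently falling back to a deprecated protocol.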
Symmetric and Asymmetric Encryption
To understand how an HTTP request gets encrypted, we first need to understand the concepts of symmetric and asymmetric encryption.
Symmetric encryption is a process that uses a single key to both encrypt and decrypt a message. Think of it as a padlock that you use at home. There is one lock and one key to unlock it. If you lose the key, you will no longer be able to unlock it. If someone else steals your key, they will be able to unlock it anytime.
In the context of a web request, this is the most obvious way to encrypt a message because of its simplicity and speed. However, this poses an important question: how can you share the key between the browser and the server securely? We can’t send the key itself to the server when making requests, because a malicious agent could intercept it in transit and compromise the entire connection.
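The single-key idea can be sketched with a toy cipher built only from Python’s standard library. This is an assumption-laden illustration, not a secure algorithm: it XORs the message with a keystream derived from the key using SHA-256, whereas real systems use vetted ciphers such as AES-GCM or ChaCha20-Poly1305. The function names `keystream` and `xor_cipher` are invented for this example.

```python
import hashlib
import secrets

def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` pseudo-random bytes (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

shared_key = secrets.token_bytes(32)           # the one key both sides must hold
message = b"my secret password"

ciphertext = xor_cipher(shared_key, message)    # encrypt with the key
recovered = xor_cipher(shared_key, ciphertext)  # decrypt with the same key
wrong = xor_cipher(secrets.token_bytes(32), ciphertext)  # a different key fails

print(recovered == message, wrong == message)
```

The same `shared_key` both locks and unlocks the message, which is exactly why getting that key to the other side safely is the hard part.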
Asymmetric encryption solves this problem by having two keys instead of just one: a private key (which must be kept secret), and a public key (which can be shared with anyone). The basic steps when performing asymmetric encryption are:
- The sender obtains the receiver’s public key. (For replies in the other direction, the receiver likewise needs the sender’s public key.)
- The sender generates an encrypted message using the receiver’s public key.
- The sender sends the encrypted message (called ciphertext) to the receiver.
- The receiver receives the ciphertext and decrypts it using the receiver’s private key.
Using this approach, neither the sender nor the receiver needs to know the other’s private key to encrypt and decrypt the message. Also, the message cannot be decrypted using the public key alone, so anyone who intercepts the ciphertext in transit still cannot read it.
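The steps above can be sketched with textbook RSA, one well-known asymmetric scheme. The numbers here are tiny on purpose so they are easy to follow; real keys are thousands of bits long and real implementations add padding, so treat this purely as an illustration. The helper name `apply_key` is hypothetical.

```python
# Receiver's key generation with two small primes (toy values).
p, q = 61, 53
n = p * q                    # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent: inverse of e mod phi (Python 3.8+)

public_key = (e, n)          # can be shared with anyone
private_key = (d, n)         # must be kept secret by the receiver

def apply_key(value: int, key: tuple) -> int:
    """Raise `value` to the key's exponent modulo n (encrypts or decrypts)."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65                                   # a message encoded as a number < n
ciphertext = apply_key(message, public_key)    # sender uses receiver's public key
recovered = apply_key(ciphertext, private_key) # receiver uses own private key

# Trying to "decrypt" with the public key does not give the message back.
failed_attempt = apply_key(ciphertext, public_key)
print(recovered == message, failed_attempt == message)
```

Only the holder of `private_key` can undo the encryption, so the public key can be published without exposing the message.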
How does HTTPS work?
The reason why it is important to understand the concepts of symmetric and asymmetric encryption is that HTTPS uses both of them to establish communication between a browser and a server. In general:
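- The browser and the server use asymmetric encryption to agree on a shared key for the session
- This shared key is then used to encrypt data between them using symmetric encryption
Here is how this initial communication (called a TLS handshake) is performed in more detail. The steps describe a simplified RSA-style key exchange as used up to TLS 1.2; TLS 1.3 instead has both sides agree on the key through a Diffie-Hellman exchange, but the overall idea is the same: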
- The browser accesses the server using its IP address (obtained using DNS).
- The server sends its certificate (TLS Certificate) to the browser. The certificate contains the public key of the server.
- The browser checks the certificate to see if it is valid.
- The browser generates a pre-master key, encrypts it using the server’s public key from the TLS certificate, and sends it to the server.
- The server decrypts the pre-master key using its private key.
- The server and the browser each derive the same shared key from the pre-master key.
- Now there is a secret, shared key known only to the browser and the server. They use this key to encrypt requests and responses throughout the session.
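The handshake steps can be simulated end to end in a few lines. This is a deliberately simplified sketch under several assumptions: the RSA key is built from two small Mersenne primes and used without padding, certificate validation is skipped, the session key is derived with a plain SHA-256 hash, and the symmetric cipher is a toy XOR keystream. Real TLS differs in all of these details; the names `derive_session_key` and `xor_cipher` are invented for this example.

```python
import hashlib
import secrets

# Server: toy RSA key pair (far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, never leaves the server

def derive_session_key(pre_master: int) -> bytes:
    """Both sides hash the same pre-master value, so both get the same key."""
    return hashlib.sha256(pre_master.to_bytes(32, "big")).digest()

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 based keystream."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# The server's certificate carries its public key.
server_public_key = (E, N)

# The browser picks a pre-master key and encrypts it with the public key.
pre_master = secrets.randbelow(N - 2) + 2
encrypted_pre_master = pow(pre_master, *server_public_key)

# The server decrypts it with its private key.
server_pre_master = pow(encrypted_pre_master, D, N)

# Both sides derive the shared session key independently.
browser_key = derive_session_key(pre_master)
server_key = derive_session_key(server_pre_master)

# From here on, traffic is protected with symmetric encryption.
request = b"GET /account HTTP/1.1\r\nHost: example.com\r\n\r\n"
on_the_wire = xor_cipher(browser_key, request)
received = xor_cipher(server_key, on_the_wire)

print(browser_key == server_key, received == request)
```

Note that the session key itself is never transmitted: only the encrypted pre-master key crosses the network, and only the server’s private key can open it.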
We can see from the steps that a TLS certificate plays a key role in securing web requests and the Internet in general. In the next article, we will explore the purpose of these certificates, how they are made, and how you can start using them as well.