HTTP/2 deep dive
The Internet as we know it would not have been possible without HTTP (Hypertext Transfer Protocol). HTTP has come a long way since its inception when it started as a simple one-line ASCII string protocol. HTTP/2 is a significant version upgrade over it predecessor HTTP/1.1
This article covers the limitations of HTTP/1.1 and how those limitations led to the protocol’s evolution to HTTP/2. The article delves into the technical concepts behind HTTP/2 and touches upon some of its own limitations.
Humble Origins – A Brief History of HTTP
HTTP is an application protocol that dates back to 1989. Tim Berners-Lee, inventor of the World Wide Web, initiated its development. HTTP/0.9 is the first documented version of HTTP. HTTP/0.9 is such a simple protocol that it was called the one-line protocol. Despite being simple, it served its purpose of bootstrapping the World Wide Web.
HTTP/0.9 requests consist of the method `GET,` followed by the document address, an optional port address, and terminated with a carriage return and a line feed. A request is just a string of ASCII characters. HTTP/0.9 was very limited. It didn’t have HTTP headers so only HTML files could be transmitted. It also didn’t have status and error codes.
Browsers and servers independently modified HTTP/0.9 to overcome its limitations. Some of the key protocol changes were:
- Requests may consist of multiple header fields separated by newline characters
- The server response consisted of a status line
- A header field was added to the response. The response header object consisted of newline-separated header fields
- Servers could respond with files other than HTML
The many flavors of HTTP/0.9 in the wild caused interoperability issues. HTTP Working Group published HTTP/1.0 (RFC 1945) in 1996 to resolve these issues. It was an informational RFC, so technically, HTTP/1.0 is not a formal specification or an Internet Standard.
The work to develop HTTP into an Internet standard happened over four years, between 1995 and 1999. HTTP/1.1 was first defined in RFC 2068, which was released in January 1997. Improvements and updates were released under RFC 2016 in May 1999. HTTP/1.1 introduced many feature enhancements and performance optimizations like:
- Persistent and pipelined connections
- Virtual Hosting
- Content negotiation, chunked transfer, compression & decompression, transfer encoding, and caching directives
- Character set and language tags
- Client identification and cookies
- Basic authentication, digest authentication, and secure HTTP
What’s the problem with HTTP/1.1?
Web applications are complex and growing. As per HTTP Archive, the median size of the desktop version of websites grew from 1.51MB to 1.97MB (+30.6%) between November 2016 and November 2017. During the same period, the median size of the mobile version of the websites grew from 1.34MB to 1.78MB (+32.3%).
Websites should load in one second or less to keep the user’s flow of thought uninterrupted. However, the median time for first contentful paint (FCP) for desktop was 2.4 seconds in November 2019. For mobile devices, the median time for FCP was 6.4 seconds.
Much of this is down to the limitations of HTTP/1.1. Let’s look at some limitations and developer hacks for overcoming them.
Head of Line Blocking
When a client sends a request to a server on a connection, that connection is useless till the server sends back a complete response. Subsequent requests cannot use the TCP connection. This situation is called head of line blocking.
Initially, browsers were allowed only two concurrent connections to a server, and the browser had to wait till one of them was free. To solve this bottleneck browsers were allowed to make six concurrent connections. This simply postponed the problem instead of solving it.
Developers started using domain sharding to make their website load faster. But domain sharding increases the complexity of development and underutilizes TCP connections.
HTTP/1.1 introduced a technique called pipelining. In theory, clients could send multiple requests to the server without waiting for the response. Although the server must respond to the requests in the order they are received. For pipelining to work, both the client and the server should support it.
However, a large or slow response can break pipelining by blocking the responses behind it. Pipelining was challenging to implement because many intermediaries and servers didn’t process it correctly.
Protocol Overheads – Repetitive headers & cookies
TCP Slow Start
Evolution of HTTP/2
Simplicity was a core principle of HTTP. However, implementation simplicity came at the cost of performance. HTTP/1.1 has many baked-in limitations, which we discussed in the previous section. Over time these limitations became too much and HTTP needed ‘fixing’.
Google announced SPDY, an experimental protocol, in 2009. This project had the following high-level goals:
- To target a 50% reduction in page load time
- To minimize deployment complexity
- To avoid the need for any changes to content by website authors
- To bring together like-minded parties interested in exploring protocols as a way of solving the latency problem
In November 2009, Google engineers announced that they had achieved 55% faster load times. By 2012 SPDY was supported by Chrome, Firefox, and Opera.
The HTTP Working Group observed this trend and initiated an effort to improve upon the learning from SPDY. In November 2012, a call for proposal was made for HTTP/2, and SPDY specification was adopted as the starting point.
Over the next few years, HTTP/2 and SPDY coevolved with SPDY as the experimental branch. HTTP/2 was published as a standard in May 2015 as RFC 7540.
HTTP/2 under the hood
Let’s take a look at how HTTP/2 works. HTTP/2 is an extension, not replacement of HTTP/1.1. It retains the application semantics of HTTP/1.1. Functionality, HTTP methods, status codes, URIs, and header fields remain the same.
Every HTTP/2 connection starts as HTTP/1.1 and the connection upgrades if the client supports HTTP/2. HTTP/2 uses a single TCP connection between the client and the server, which remains open for the duration of the interaction.
Binary framing layer
HTTP/2 introduces a new binary protocol called the binary framing layer. The binary framing layer is responsible for all performance enhancements in HTTP/2, laying down the protocol for encapsulation and transfer of messages between the client and the server.
The binary framing layer breaks the communication between the client and server into small chunks and creates an interleaved bidirectional stream of communication. Thanks to the binary framing layer, HTTP/2 uses a single TCP connection that remains open for the duration of the interaction. Let’s look at some terminology used in HTTP/2 before moving on.
A frame is the smallest unit of communication in HTTP/2. Frame carries a specific type of data like HTTP headers or payloads. All frames begin with a fixed 9 octet header. The header field contains, amongst other things, a stream identifier field. A variable-length payload follows the header. The maximum size of a frame payload can be 2^24 – 1 octets.
A message is a complete HTTP request or response message. A message is made up of one or more frames.
- A stream is a bidirectional flow of frames between the client and the server. Some important features of streams are:
- A single HTTP/2 connection can have multiple concurrently open streams. The client and server can send frames from different streams on the connection.
Streams can be shared by both the client and server. A stream can also be established and used by a single peer.
- Either endpoint can close the stream.
- The order of frames on a stream is important. Receivers process frames in the order they are received. The order of headers and data frames has semantic significance.
- An integer identifies streams. The endpoint initiating the stream adds the identifying integer. is assigned by the endpoint which initiated the stream.
The binary framing layer solves the problem of head of line blocking, making HTTP/2 fully multiplexed. The binary framing layer breaks the communication between the client and server into small chunks and creates an interleaved bi-directional flow of streams.
Either peer (the client or the server) can specify the maximum number of concurrent streams the other peer can initiate. A peer can reduce or increase this number anytime.
At this point, you must be wondering why multiplexing over a single TCP connection won’t result in head-of-line blocking.
HTTP/2 uses a flow-control scheme to ensure that streams are non-blocking. Flow-control is simply an integer that indicates the buffering capacity of a server, a client, or an intermediary (like a proxy server). Flow control is used for individual streams and the connection as a whole. The initial value for the flow-control window is 65,535 octets for both new streams and the overall connection.
A peer can use the WINDOW_UPDATE frame to change the value of octets it can buffer. Server, clients, and intermediaries can independently advertise their flow-control window and abide by the flow-control windows set by their peers.
HTTP/2 protocol allows multiple requests and responses to be chopped into frames and multiplexed. The sequence of transmission becomes critical for performance. HTTP/2 uses stream prioritization to tackle this issue. The client creates a prioritization tree using two rules:
- A stream can depend on another stream. If steam is not given an explicit dependency, then it depends on the root stream. Streams have a parent-child relationship.
- All dependent streams may be given an integer weight between 1 and 256.
Parent streams are delivered before their dependent streams. Dependent streams of the same parent should get resources proportional to their weights. However, dependent streams that share the same parent are not ordered with respect to each other.
A new level of dependency can be added by using the exclusive flag. The exclusive flag makes the stream the sole dependent of its parent stream. Finally, stream priorities can be changed using the PRIORITY flag.
HTTP requests contain headers and cookies data, which add performance overhead. HTTP/2 uses the HPACK compression format to compress request and response metadata. Transmitted header fields are encoded using Huffman coding. HTTP/2 requires the client and the server to maintain and update an indexed list of previously seen header fields. The indexed list is used as a reference to encode previously transmitted values efficiently.
Server push is a performance feature that allows a server to preemptively send responses to a client ahead of time. This feature is useful when the server knows that the client needs the ‘pushed’ responses to process the original request fully.
Server push has the potential to improve performance by utilizing the network, which otherwise would be idle. However, server push can be counterproductive if not implemented correctly.
Pushing resources that are already in the client’s cache wastes bandwidth. Wasting bandwidth has an opportunity cost because it could be used for relevant responses. Cache Digests for HTTP/s is a proposed standard that addresses this issue. In the meantime, this issue can be tackled by pushing on the first visit only and using `cache-digest`.
Pushed resources compete with the delivery of the HTML, adversely impacting page load times. Developers can avoid this by limiting the size of assets pushed.
Differences between HTTP/2 and HTTP/1.1
Let’s look at some differences between HTTP/2 and HTTP/1.1:
- HTTP/1.x is a text protocol while HTTP/2 is a binary protocol
- HTTP/1.x can be without encryption, but HTTP/2 is mostly encrypted. The HTTP/2 protocol does not force encryption, but most clients require encryption.
- In HTTP/2, negotiation happens through an “Upgrade” HTTP header that is sent by the client. The negotiation can also happen during the TLS handshake using ALPN.
HTTP/2 uses streams for multiplexing Streams can be prioritized, re-prioritized, and canceled at any time. Streams have individual flow control and can have dependencies. It is more efficient than pipelining used in HTTP/1.1.
- HTTP/1.x headers are transmitted in plain text, but HTTP/2 compresses headers.
- Servers can push content to the client that has not been asked by the client.
- In HTTP/1.1, the client cannot retry a non-idempotent request when an error occurs. Some server processing may have occurred before the error, and a reattempt can have an undesirable effect. HTTP/2 has mechanisms for providing a guarantee that a request has not been processed.
Currently, all browsers support HTTP/2 protocol over HTTPS. Four serving an application over HTTP/2, the web server must be configured with an SSL certificate. You can quickly get up and running with a free SSL certificate using Let’s Encrypt.
All major server software supports HTTP/2:
- Apache 2.4.12 supports HTTP/2 via mod_h2. However, patches must be applied to the source code of the server. As of Apache 2.4.17, HTTP/2 is supported via mod_http2 without the need for adding patches.
- NGINX versions 1.9.5 and above support HTTP/2. It supports Server Push since version 1.13.9.
- Microsoft IIS supports HTTP/2 in Windows 10 and Windows Server 2016.
HTTP/2 allows the client to send all requests concurrently over a single TCP connection. Ideally, the client should receive the resources faster. On the flip side, concurrent requests can increase the load on the servers. With HTTP/1.1, requests are spread-out. But with HTTP/2 servers can receive requests in large batches, which can lead to requests timing out. The issue of server load spiking can be solved by inserting a load balancer or a proxy server, which can throttle the request.
Server support for HTTP/2 prioritization is not yet mature. Software support is still evolving. Some CDNs or load balancers may not support prioritization properly.
HTTP/2 push feature can do more harm than good if not used intelligently. For example, a returning visitor may have a cached copy of files, and the server should not push resources. Making server push cache-aware solves this problem. However, cache-aware push mechanisms can get complicated.
HTTP/2 has addressed the HTTP level head of line blocking problem. However, packet-level head-of-the line blocking of the TCP stream can still block all transactions on the connection.
HTTP/2 is optimized for speed, which makes for a better user experience. Using HTTP/2 can also lead to developer happiness because hacks like domain sharding, asset concatenation, and image spriting are no longer necessary, which lowers the development complexity.