Diving into HTTP/3: Security, congestion control, and TCP comparisons on QUIC protocol
By Pau Sabates, Cloud Engineering
HTTP/3 is becoming an increasingly popular term. Companies like Uber, Facebook and Google are already using it on some services and reporting latency improvements. Others, such as Cloudflare, have published articles about their research and benchmarks, and some are even developing their own open source implementations of the protocol. Is HTTP/3 becoming the new standard? What benefits does it bring compared to its predecessors, HTTP/1.1 and HTTP/2? And most importantly, what does HTTP/3 actually mean? How does it work and why do we need it?
The answers to these questions do not reside at the application layer. Instead, we get a deeper understanding by looking at the new transport layer protocol, QUIC.
In this article we’ll take a close technical look at this protocol and how it handles congestion control, security and performance challenges in order to become the next standard.
Context and current HTTP (TCP) problems
Without going into detail on the history of previous HTTP versions (for that you can take a look, for example, at this article from a colleague at Edrans that sums up the evolution of HTTP), HTTP/2 was a great improvement on the protocol, but we’ve seen that it still has a lot of inefficiencies and problems. The first thing we want to point out is that these problems are actually TCP problems, so HTTP/3 tackles them by changing the transport protocol. We’ll talk in detail about this new protocol, QUIC, but first let’s do a quick review of some of the most important TCP problems.
The main one is related to the origin of the protocol: it was not designed to transport multiple files over a single connection. As a result we have packet-level Head-of-Line (HoL) blocking: TCP sees all the data as a single byte stream, so if any packet is lost, all the data behind it has to wait until that packet is recovered. HTTP/2 did solve HoL blocking, but only at the application layer, not at the packet/transport layer. We’ll dive deeper into that later.
Another well-known performance problem is the initial TCP handshake (SYN, SYN-ACK, ACK) required to establish a connection. If we add the TLS handshake we need at least one extra round trip (two, if we are using a version prior to TLS 1.3) before we can send our request. So we can end up with three round trips before the request even leaves, and on a slow network each of them may take 100 ms. We’ll come back to this later, and also analyze some security risks.
Another issue worth mentioning is how difficult it is to evolve the TCP protocol itself. Over the years we have improved performance with new HTTP versions and features like pipelining or multiplexing, we have used CDNs to reduce geographic latency, and we have also tried to improve TCP itself, for example with ‘Multipath TCP’, or with ‘TCP Fast Open’ to deal with the handshake overhead. Unfortunately, updating and improving the protocol is not easy because new versions are hard to deploy: TCP is implemented not only by end users’ machines but also by many network devices such as load balancers, routers and proxies. It could take decades before all their TCP implementations are updated.
So we were in a situation where we had to develop a new protocol that solves all the TCP problems explained above and, at the same time, is fast and easy to adopt on the current internet. And most importantly, without compromising security. This gave birth to QUIC, a new transport protocol that could in principle run directly on top of layer 3 (IP), but instead sits on top of UDP, which is supported by virtually all devices and doesn’t have some of the limitations we have with TCP.
How does QUIC handle packet loss on top of UDP, and how does it integrate with TLS? This and more is explained in the following sections.
QUIC packets on UDP datagrams
Before getting deeper into QUIC, bear in mind that the protocol does not use UDP in order to make HTTP/3 faster. UDP is used to allow easy adoption on current networks, which in turn lets QUIC deliver some performance improvements compared to TCP:
As we see in figure 1, QUIC runs on top of UDP, so both are actually transport protocols, but QUIC is the one making sure the connection is secure and reliable, and the one handling congestion control. QUIC is encrypted by default using TLS 1.3, meaning that all HTTP/3 connections have to be HTTPS. As we can see in figure 2, a QUIC packet looks something like this compared to a TCP segment:
QUIC encrypts almost all of its packet header fields as well as the payload. Transport-layer information such as packet numbers and header flags, which is never encrypted in TCP, is no longer readable by intermediaries in QUIC. Also, because it is almost fully encrypted, we can deploy new protocol versions more easily, as we only need to update the endpoints and not all the middleboxes in between. On the other hand, that encryption may cause performance issues in high-throughput scenarios, as QUIC encrypts each packet individually, whereas with TLS over TCP we can encrypt several packets’ worth of data at once.
From a security point of view though, using UDP can be problematic: even though most network devices can parse and understand UDP, it is often blocked because it is frequently used for attacks and is not normally used by most companies’ applications. Allowing UDP on the port that will serve the HTTP/3 application might involve changing corporate security policies and creating new firewall rules, such as allowing only QUIC over UDP and only 1-RTT packets (we will talk about that later). So in the end, firewalls will have to be adapted to allow QUIC connections securely, and that could require time and knowledge (although I’m sure cloud providers will find a way to make it easier ;) ).
Approaching the HoL blocking problem at the packet level
As we saw in the two figures mentioned above, the QUIC payload uses frames as its basic unit of communication. Reusing the idea from HTTP/2, where we had streams inside a TCP connection, we now open multiple streams to the server over a single QUIC connection. Each of these QUIC streams is identified by a unique ID and delivers its data as an ordered byte stream, as TCP does. A stream is made up of frames, which can be compared to TCP segments.
All the reliability logic is managed by QUIC: frames can be packaged into one QUIC packet even if they come from different streams, and QUIC endpoints can decide how to allocate bandwidth between streams and how to prioritize them. The most noticeable advantage of having multiple streams is how they deal with packet-level Head-of-Line (HoL) blocking.
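To make the idea of frames from different streams sharing one packet a bit more concrete, here is a minimal Python sketch. The `StreamFrame` class, the `pack_frames` helper and the 1000-byte payload limit are assumptions made up for illustration; real QUIC frames carry their own headers and more fields, and this is not how any particular implementation packs them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StreamFrame:
    stream_id: int  # which stream the bytes belong to
    offset: int     # position of these bytes inside that stream
    data: bytes


def pack_frames(frames: List[StreamFrame], max_payload: int) -> List[List[StreamFrame]]:
    """Greedily group frames into packets, keeping each packet's data within max_payload bytes."""
    packets, current, used = [], [], 0
    for frame in frames:
        # Start a new packet when the next frame would not fit in the current one.
        if current and used + len(frame.data) > max_payload:
            packets.append(current)
            current, used = [], 0
        current.append(frame)
        used += len(frame.data)
    if current:
        packets.append(current)
    return packets


# Three streams, each with 400 bytes ready to send (illustrative values).
frames = [
    StreamFrame(stream_id=0, offset=0, data=b"a" * 400),
    StreamFrame(stream_id=4, offset=0, data=b"b" * 400),
    StreamFrame(stream_id=8, offset=0, data=b"c" * 400),
]
packets = pack_frames(frames, max_payload=1000)
for number, packet in enumerate(packets):
    print(number, [frame.stream_id for frame in packet])
# prints:
# 0 [0, 4]
# 1 [8]
```

The first packet carries data from two different streams, which is exactly what a single TCP byte stream cannot express at the transport layer.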
The application-level Head-of-Line blocking problem appears because with plain HTTP/1.1 we can only have one HTTP request in flight per connection at a time, which results in a lot of waiting, for example on web pages that need many requests to fetch all their resources. Opening more connections is also really inefficient, because each new connection pays the handshake and TLS overhead again. With HTTP/1.1 we can actually use ‘pipelining’, which allows us to send multiple HTTP requests without waiting for each response, but the responses have to come back in the order of the requests and it’s not widely supported. We mentioned before that HTTP/2 solved the HoL problem, but only at the application (HTTP) level. This is achieved by multiplexing: we do multiple requests over a single TCP connection and we avoid waiting for one request to finish before starting the next. The problem is that if one of the TCP packets on that connection is lost, the whole TCP connection still has to wait until that packet is recovered.
For example, suppose we request 3 different HTML files (we’ll call them X, Y and Z for ease of writing), multiplexing them over a single TCP connection. We’ll also play the role of the window size negotiation on our network and suppose that each file is divided into 10 packets.
So we’ll probably have a TCP connection carrying 5 packets from X, 5 from Y, 5 from Z, then 5 from X again, and so on. Basically we multiplexed the files: we are receiving data from 3 different files over a single established TCP connection. If none of these packets is lost, we’ll receive the 3 files over that single connection, awesome! But what if we lose, for example, the last packet of Z’s first batch? Then we have a problem. You might think that the packets of X and Y that arrive after it can be processed on the client side while we wait for the lost packet of Z, but that’s not possible. TCP doesn’t know which packet belongs to X, Y or Z; it just delivers one ordered stream of bytes, so everything behind the gap is held back until the lost packet has been retransmitted and received. This is HoL blocking at the TCP or packet level.
With QUIC we can create a byte stream per resource, so in that example we will have three different streams, each ordered and recovered independently. The X and Y streams can be processed while the Z stream is blocked, without affecting the other requests.
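The X, Y, Z example can be replayed with a small Python simulation. This is only a sketch under simplifying assumptions: packets arrive in batches of 5 per file, the lost packet is the last one of Z’s first batch (sequence number 14), and retransmission is not modelled at all. The function names are hypothetical.

```python
from collections import defaultdict


def build_packets():
    """Arrival order on the wire as (sequence number, file name): 5 of X, 5 of Y, 5 of Z, twice."""
    packets, seq = [], 0
    for _batch in range(2):
        for name in ("X", "Y", "Z"):
            for _ in range(5):
                packets.append((seq, name))
                seq += 1
    return packets


def tcp_delivered(packets, lost_seq):
    """One ordered byte stream: nothing after the first gap reaches the application."""
    delivered = defaultdict(int)
    for seq, name in packets:
        if seq == lost_seq:
            break
        delivered[name] += 1
    return dict(delivered)


def quic_delivered(packets, lost_seq):
    """Ordering is per stream: only the stream that owns the lost packet stalls."""
    delivered = defaultdict(int)
    blocked = set()
    for seq, name in packets:
        if seq == lost_seq:
            blocked.add(name)
        elif name not in blocked:
            delivered[name] += 1
    return dict(delivered)


packets = build_packets()
lost = 14  # last packet of Z's first batch
print("TCP :", tcp_delivered(packets, lost))   # TCP : {'X': 5, 'Y': 5, 'Z': 4}
print("QUIC:", quic_delivered(packets, lost))  # QUIC: {'X': 10, 'Y': 10, 'Z': 4}
```

With a single ordered stream, the second half of X and Y is stuck behind the gap. With one stream per file, X and Y are complete and only Z waits for the retransmission.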
So this is good: we can request multiple resources over a single QUIC connection. But can we go further and improve the efficiency of establishing that connection? As in TCP, there is an initial handshake, so let’s compare them.
Connection handshake and 0-RTT benefits (or… not)?
At the beginning, when we were going through some TCP problems, we talked about the initial handshake for establishing a TCP connection, which can require up to three round trips including the TLS handshake. In truth, that is only the case if we are not using TLS 1.3; with TLS 1.3 it decreases to two round trips. The interesting part is that QUIC, because it has encryption integrated, combines the transport and cryptographic handshakes, so we only need one round trip:
We certainly gain efficiency by avoiding handshake round trips and reducing latency but, to be honest, where we will truly see that benefit is on slow or unstable networks.
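To put rough numbers on that, the following Python sketch multiplies the round trips discussed above by a round-trip time. The 20 ms and 100 ms values are assumptions for a "fast" and a "slow" network, and the model ignores processing time, packet loss and connection resumption.

```python
# Round trips spent on connection setup before the first HTTP request can be sent.
SETUP_ROUND_TRIPS = {
    "TCP + TLS 1.2": 1 + 2,  # TCP handshake + two TLS round trips
    "TCP + TLS 1.3": 1 + 1,  # TCP handshake + one TLS round trip
    "QUIC": 1,               # combined transport and TLS 1.3 handshake
}


def setup_delay_ms(round_trips: int, rtt_ms: float) -> float:
    """Time spent on handshakes alone for a given round-trip time."""
    return round_trips * rtt_ms


for rtt_ms in (20, 100):  # assumed fast and slow networks
    for name, trips in SETUP_ROUND_TRIPS.items():
        print(f"RTT {rtt_ms:>3} ms | {name:<13} | {setup_delay_ms(trips, rtt_ms):.0f} ms")
```

On the assumed 20 ms network QUIC saves 20 ms compared to TCP with TLS 1.3, which is hard to notice; on the 100 ms network the same single round trip is worth 100 ms.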
A common attack on the connection handshake is the SYN flood, a DoS attack in which a lot of SYN requests are sent to the server. The attacker ignores the SYN-ACK responses and never sends back the ACK; instead it sends more SYN packets. Each half-open request remains allocated on the server, which keeps receiving more and more of them until all its resources are consumed.
On TCP we have multiple ways to try to mitigate it, but the one known as SYN cookies might be the most similar to the current QUIC mitigation technique. It consists of the server sending the SYN-ACK response with an initial sequence number derived from the client IP address, port number and other identifying information, without storing any state. Only if the client answers with an ACK that matches that value does the server allocate memory for the connection.
On QUIC the current mitigation is the Retry token, and it’s basically the same idea: the server generates a token and sends it to the client, which has to present it again before the server commits resources to the connection. This mechanism should only be applied under attack conditions though, as it costs an extra round trip.
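The idea shared by SYN cookies and Retry tokens, proving that the client really owns its address without the server storing anything, can be sketched with an HMAC. This is a simplified illustration and not the token format of TCP or of any QUIC implementation; `SECRET`, `make_token`, `check_token` and the 10-second lifetime are all hypothetical.

```python
import hashlib
import hmac
import os

SECRET = os.urandom(32)  # hypothetical server-side key, never sent to clients


def make_token(client_ip: str, client_port: int, now: float) -> bytes:
    """Bind a token to the client address and issue time; the server keeps no state."""
    issued = int(now).to_bytes(8, "big")
    message = f"{client_ip}:{client_port}".encode() + issued
    tag = hmac.new(SECRET, message, hashlib.sha256).digest()
    return issued + tag


def check_token(token: bytes, client_ip: str, client_port: int,
                now: float, max_age: int = 10) -> bool:
    """Accept the token only if it is fresh and was issued for this exact address."""
    if len(token) != 8 + 32:
        return False
    issued, tag = token[:8], token[8:]
    message = f"{client_ip}:{client_port}".encode() + issued
    expected = hmac.new(SECRET, message, hashlib.sha256).digest()
    fresh = 0 <= int(now) - int.from_bytes(issued, "big") <= max_age
    return fresh and hmac.compare_digest(tag, expected)


token = make_token("203.0.113.7", 4433, now=1000.0)
print(check_token(token, "203.0.113.7", 4433, now=1005.0))   # True: same address, still fresh
print(check_token(token, "198.51.100.9", 4433, now=1005.0))  # False: different address
print(check_token(token, "203.0.113.7", 4433, now=1020.0))   # False: token expired
```

A spoofing attacker never sees the token, because it is sent to the address they are pretending to be, so they cannot echo it back and the server never allocates resources for them.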
To improve efficiency even more, there is an alternative called zero round trip (0-RTT), where the only round trip is the one carrying the HTTP request and its response. The idea is that a client that has already connected to the server before reuses what was negotiated in that first connection and sends its HTTP request together with the handshake, so we spend zero round trips on establishing the connection. We might think that HTTP/3 can then give us a big latency improvement but… in most cases it will be unnoticeable. For example, compared with HTTP/2 over TLS 1.3 on a current fast network it’ll be maybe 50 ms faster, just from saving that additional round trip, so in that case it’s probably not worth changing the whole communication protocol of your application. On a slow network, or when the client is geographically far from the server, it is surely more worthwhile.
From a security point of view though, HTTP/3 0-RTT can lead to some attacks that are already solved over TCP. For QUIC we do have some mitigations, but they still need a lot of research and improvement. The first security concern, for example, is an amplification attack through IP spoofing.
This is a type of denial of service attack. Because the point of 0-RTT is to avoid the handshake described above (which TCP always performs), an attacker could spoof the IP address of a victim and send a lot of requests to the server. Since the server has not verified who is really behind that address, it sends all the HTTP responses to the spoofed victim. The mitigation QUIC currently has for this is that the server may only send back three times the amount of data it has received from the client; anything beyond that has to wait until the client sends something back, proving that it is the real client. In practice this means the first responses will be at most around 6 KB of data, so larger responses are split and require more round trips. As we can see, the performance improvement is not that real: if we have to rely on non-0-RTT communication and we have a certificate greater than 6 KB, we will end up with the same number of round trips as TCP + TLS.
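The three-times rule can be sketched as a small byte counter. The `AmplificationLimiter` class below is a hypothetical illustration, not code from any QUIC library, and the 1200-byte datagrams are just an example size.

```python
class AmplificationLimiter:
    """Tracks how many bytes a server may send to an address it has not validated yet."""

    FACTOR = 3  # send at most three times what was received

    def __init__(self) -> None:
        self.received = 0
        self.sent = 0
        self.validated = False

    def on_receive(self, size: int) -> None:
        self.received += size

    def budget(self) -> float:
        """Bytes that may still be sent right now."""
        if self.validated:
            return float("inf")
        return self.FACTOR * self.received - self.sent

    def try_send(self, size: int) -> bool:
        if size > self.budget():
            return False  # must wait for more data from the client
        self.sent += size
        return True


limiter = AmplificationLimiter()
limiter.on_receive(1200)                             # one datagram from the (possibly spoofed) client
print([limiter.try_send(1200) for _ in range(4)])    # [True, True, True, False]
limiter.on_receive(1200)                             # the client answers, so the budget grows again
print(limiter.budget())                              # 3600
```

An attacker who spoofs the victim’s address never receives the server’s packets, so they cannot keep the exchange going and the victim gets at most three times what the attacker sent.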
There is one more important attack related to 0-RTT, the replay attack, in which an attacker copies one of those fully encrypted packets and sends it to the server again (imagine an e-commerce transaction request packet…). This is why Cloudflare, for instance, limited 0-RTT requests to GET requests without query parameters. The downside is that this limits the possible performance improvements of QUIC.
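A policy like the one described above could be expressed as a simple check on the server side. This is only a sketch of the rule as stated in this article, not Cloudflare’s actual code; the `allowed_in_0rtt` function and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def allowed_in_0rtt(method: str, url: str) -> bool:
    """Accept only GET requests with no query string as 0-RTT (early) data."""
    return method.upper() == "GET" and urlsplit(url).query == ""


print(allowed_in_0rtt("GET", "https://example.com/index.html"))   # True
print(allowed_in_0rtt("GET", "https://example.com/search?q=tv"))  # False: has query parameters
print(allowed_in_0rtt("POST", "https://example.com/checkout"))    # False: not a GET
```

Anything the check rejects simply has to wait for the regular handshake to complete, so replaying the captured packet cannot trigger it a second time.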
Congestion Control, packet loss and reliability
Of course, if we analyze performance in terms of latency as we did before, we also have to make sure that the protocol manages bandwidth efficiently and avoids packet loss.
Congestion control is a big and complex topic. Finding an algorithm that is efficient in all network situations and gives a fair share of bandwidth to every device is an area of continuous research, so I won’t go into more detail, as it could be a whole new article.
But just as a quick summary: on TCP we never know how much bandwidth we can use when starting a connection, so as a sender we start with a very small number of packets and check whether we receive their acknowledgements. If there is no packet loss we keep increasing the rate until we start detecting loss. This applies to every network on the path of those packets (think of 4G, a server network serving multiple clients, etc.), so congestion anywhere will lead to fewer packets arriving at a time, and the file download or web page load will be slower.
UDP, on the other hand, is commonly known for not caring whether a packet is lost, so we could send packets at a higher rate regardless of congestion, as the idea is for packets to arrive fast. You might think that QUIC works like that because it’s based on UDP, but it’s actually more similar to TCP, because HTTP needs the reliability of receiving all packets. So how does QUIC achieve that?
QUIC uses a mechanism really similar to the one I described above for TCP, sending a small number of packets and increasing that rate while there is no loss. The main difference is that QUIC implementations run in user space, whereas TCP runs in the operating system’s kernel, meaning that congestion control algorithms can be open sourced and can be evolved and improved more easily and faster. On the other hand, this can lead to some performance inefficiencies, as it involves copying extra packet data between kernel memory and user memory, so research on improving that efficiency is key.
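The ramp-up described above can be sketched as a toy simulation of the congestion window, round trip by round trip. This is a heavily simplified, Reno-style model and not the algorithm of any specific TCP or QUIC stack; the `simulate_cwnd` function, the initial window of 10 packets and the path capacity of 50 packets per round are assumptions for illustration.

```python
def simulate_cwnd(rounds: int, capacity: int, initial: int = 10) -> list:
    """Congestion window (in packets) per round trip: double until loss, then halve and grow by one."""
    cwnd, slow_start, history = initial, True, []
    for _ in range(rounds):
        history.append(cwnd)
        if cwnd > capacity:        # the path dropped packets this round
            cwnd = max(cwnd // 2, 1)
            slow_start = False
        elif slow_start:
            cwnd *= 2              # exponential ramp-up while nothing is lost
        else:
            cwnd += 1              # cautious linear growth after the first loss
    return history


print(simulate_cwnd(rounds=8, capacity=50))
# [10, 20, 40, 80, 40, 41, 42, 43]
```

Because a QUIC stack lives in user space, swapping this logic for a different algorithm means shipping a new version of the application or library rather than waiting for a kernel update.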
Some current QUIC implementations
Taking a look at the current QUIC implementations, some are still experimental, such as the one in Nginx, but we also have, for instance, quic-go, developed in Go, and Rust implementations such as Cloudflare’s quiche and the recently announced s2n-quic from AWS, so we’ll probably soon have HTTP/3 running on AWS CloudFront and maybe on AWS ELB too, very exciting!
To sum up, we have seen that the improvements of HTTP/3 depend on QUIC doing better than TCP in performance and security. We have seen that, even if in most current situations it might not yet be worth adopting, it can be a game changer for certain big companies like Google, Facebook and Uber, and it is also important for delivering to areas with slow and unstable networks. In some other cases, such as datacenter operations or factories, running directly on TCP sockets might be worth the kernel performance advantage for the moment. For other companies, the firewall and security concerns might be a problem for system administrators if they have to start handling 0-RTT packets separately and so on, as we discussed for the security attacks.
On the whole, in my opinion, QUIC is very promising, as I think that with our current HTTP implementations it was getting harder and harder to keep scaling due to TCP limitations. We need to keep researching (taking advantage of it being open source and running in user space) how to solve the concerns we’ve seen, in order to make it truly more efficient than TCP and useful for more scenarios.
In case you want to go deeper into HTTP/3, I leave here some awesome articles that are really worth checking out, as well as the RFC. Thanks for reading!