HTTP/2 What, Where, Why and When?!

Patrick Hamann speaking at Front-End London in August, 2016

About this talk

The onset of HTTP/2 is upon us. Over 75% of your users' browsers now support the low-latency transfer protocol, yet adoption has been slow.


Transcript


[00:00:08] My name is Patrick Hamann; you can catch me on Twitter @PatrickHamann. I’m always up for a rant about performance, or a chat — come and grab me in the pub afterwards if you’re not on Twitter. Please talk, it’s good to talk. If you haven’t guessed it already, I live and work here in London at the Financial Times. We are the world’s leading news organisation for business and economic affairs, where amongst other things I’m helping to rebuild the next generation of FT.com. We’re redefining how the FT delivers news, creating an immersive and personalised news experience for our users. One of the core remits when building this product was that we wanted to make it extremely fast. We’ve done a lot of research and know that speed correlates directly with user engagement, and therefore with people coming back to our site, reading more news, subscribing more, giving us more money and paying my wages. I’m not actually here today to talk about the product development side of that, though I’d love to; I’m here to talk about how we’re making it as fast as possible and, more specifically, about one of the underlying technologies we’re using to deliver the news as fast as possible. [00:01:20] Hopefully this will look slightly familiar to many of you in the room, especially if you’re a developer. Since their inception, webpages have been delivered over the network in the same way for tens and tens of years: TCP/IP at the bottom, HTTP at the top, and we use that transfer protocol to deliver the HTML, CSS, and JavaScript that form the webpages you build on a daily basis and your users interact with. As application designers and developers, we’ve rarely had to understand how the bottom half of this stack fits together and how it communicates with the top layer. By and large, that’s remained the same for about 20 years now, but this is beginning to change, and it’s quite an exciting time because of that. That, again, is what I’m here to talk to you about today. [00:02:09] For the first time in over 20 years, we have a new version of the underlying transfer protocol of the internet, the protocol that all our websites use to deliver their assets. The simplicity of the HTTP protocol is the reason it’s lasted so long: from its initial design by Tim Berners-Lee and colleagues at CERN, it was such a well-designed specification that it didn’t really need to change that much. It has lasted us remarkably well. We now have fridges, watches, even cars that communicate with each other and with servers around the world using this transfer protocol. You might find it a bit scary that cars are talking to each other over HTTP, but I think it’s an amazing testament to the protocol and why it has survived so long. As things get older, though, they start to show signs of stress. The web of 1991 is very different to the web of 2016, and to meet these new challenges the protocol is starting to creak. That’s why, in 2012, the httpbis working group, part of the IETF, announced a new initiative to create HTTP/2. It’s taken until last year for that working draft to come out and be finalised, and that’s why it’s exciting: literally last year, we got version 2 of the underlying transfer protocol of the internet. [00:03:38] This is very much off the back of the work that Mike Belshe and colleagues at Google did on the SPDY draft.
I might talk a little bit about that later if I have time. That’s what I’m going to cover today: the why, the what, the when and the where. Why do you need to know about HTTP/2, how does it work underneath, and how can knowing that help you in your daily work? Most of you are probably asking — and I’d definitely ask myself this as well — why did we need a new version? I’m still building websites every single day, delivering them, they’re working, our users love them and come back, so why do we need this? It’s quite common and quite right to ask that, but I have to let you in on a little secret: we’re all trapped in a false sense of security, because our websites are in fact using the network extremely inefficiently. Why is this? Hopefully the next section will explain. I mentioned earlier that the web of ’95 — this is FT.com in ’95 — is very different from the web of today. Our users expect a much more immersive experience: hundreds of resources, different mediums, images, video, interactions. But all of this comes at a cost when we deliver it down the network to our users. As you can see, it’s very, very different. That is literally just text on a white background — they didn’t have CSS then, this is all just tables — and it was probably just one single file, two at most. [00:05:08] It’s completely different to what our users expect now. We have to hire hundreds of people to be able to deliver this and keep up with the competition. Our average webpages are now making over 80 requests per page, and that’s not the whole story: the 95th percentile is more like 300-plus. We are at the peak of a website obesity crisis, and the trend, as you can see here, is not stopping any time soon. Let that settle in: 80 requests. HTTP/1.0 and 1.1 simply were not designed to cater for that sheer number of requests going down the pipe; they were designed for more like one or five. The first website that Tim Berners-Lee ever made was a single document linking to another one. That was it: just one request. Whilst we’ve seen a great increase in available bandwidth over the last few years — a lot of us here are very privileged to live in London, where the likes of Virgin Media will run cable into your house with something ridiculous like a 200Mb-per-second pipe going into your own living room — we haven’t seen the same improvements in latency. Now, what is latency? Latency, essentially, boils down to the time it takes for a request to go from your mobile phone all the way to the server and back again: the round-trip time. Mobile networks, by their pure nature, are highly latent things. It’s in the name: you are mobile, you are walking around, it’s going to take a long time. [00:06:48] When Google were experimenting with SPDY — Mike Belshe wrote a great article on this; the slides will be online later — they found that as you increase bandwidth, page load time plateaus: there is a threshold beyond which more bandwidth has no further impact on your page load time, whereas for every 20-millisecond improvement in latency there was a near-linear improvement in the page load time of your website. There are many good reasons for this. As we’ve seen, at 80 requests the average page is composed of many small resources, which require many connections — many TCP connections, each with their own overhead.
The performance of each of those connections is very, very closely tied to your round-trip time. In HTTP/1, the data was defined as an ASCII character stream — text, so you can actually read it on the wire. Because of this, we must send and receive requests in exactly the same order we sent them. When we send a request for image one, we have to wait for the server to respond with that data, so the browser can match the response to the request, before we can send the next request on that single TCP connection. This phenomenon is known as “head-of-line blocking”. An analogy: you’re in a bank and there are two people in front of you waiting for the cashier. You are blocked by them for however long the cashier takes to process them before you can go. [00:08:21] We have exactly the same problem with TCP connections: because the data is an ASCII stream, we can’t interleave it — things might get mixed up, the cashier might get confused — so we have to wait for each response before we can send the next request.
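To make that concrete, here’s a minimal sketch of that raw ASCII exchange: a plain-text HTTP/1.1 request over a single socket, where the whole response has to be consumed before the same connection can carry the next request. The host and image path are just stand-ins for the example:

```python
import socket

# Open a single TCP connection -- in HTTP/1.1 this is the unit of ordering.
sock = socket.create_connection(("example.com", 80))

# The request is literally ASCII text on the wire: method, path, headers.
request = (
    "GET /image1.png HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)
sock.sendall(request.encode("ascii"))

# Head-of-line blocking: we must consume this response before the same
# connection is free to carry the request for image2.png.
response = b""
while b"\r\n\r\n" not in response:  # read until at least the headers arrive
    response += sock.recv(4096)
print(response.split(b"\r\n\r\n")[0].decode("ascii"))  # the ASCII headers

sock.close()
```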
Now, this is a fundamental flaw — well, some would say — in the design of HTTP/1, and more importantly it has had a ripple of effects. To get around the issue, browsers started to open more than one TCP connection to the same host: “I can’t send another request on this TCP connection, so I’m going to have to open another one.” We started off by opening two. Then that wasn’t enough, so now the specification says we’re allowed six open TCP connections to the same host name to overcome head-of-line blocking. But this comes at a great cost to our users. Each connection incurs a full TCP handshake — if you’ve ever seen me talk before, I’ve explained that in depth — and on an average UK 3G connection that can be up to a thousand milliseconds just to open the TCP connection. If you’re on a secure connection, that means a TLS handshake as well. And each connection competes for the underlying network resources, the bandwidth, potentially causing congestion on the underlying network link. [00:09:37] That wasn’t enough, though. We stopped at six and said, “No, I need to optimise this as much as possible”, so we started to create hacks and anti-patterns to overcome head-of-line blocking. One of these is concatenation. I’ve heard all the cool kids today like to use React, Angular, jQuery, Bootstrap, and MooTools all in the same application — in fact, Zeb just told me my dear friend Olly is using an architecture like that at the moment; he’s no longer my friend. Why have the overhead of creating five connections when I can concatenate it all into a single file, so I only need to open one TCP connection? This is great, and I still do it on a daily basis — most of you will have build processes that concatenate these files together — but it comes at a cost, right? That’s more CPU and memory overhead for a low-powered mobile device to download and parse, even if it only actually needs to execute two lines of MooTools. You’ve just made the user download all of those bytes. And if I change a single line — even add a semicolon — I invalidate the whole file, and even though I only changed one byte, I’m forcing the user to download it all again. Images actually came before this: images were the main reason we started opening more connections, because in the early 00s they were the main medium we were throwing more and more at. [00:11:00] We started to borrow ideas from the games developers of the 80s and sprite images together. Rather than making 200 HTTP requests over probably six TCP connections, I can concatenate them — sprite them — into one. The problem here is that I probably only need one image on my page, so I’ve just forced the user to download 200 images’ worth of pixels, maybe even 2MB of data, when all I wanted was to display that one image. And again there’s the invalidation problem: my designer decides this devil actually needs to be green, and we’ve just forced the user to re-download all of those images, getting no benefit at all from the cache on the user’s device. Then we decided that six connections per host weren’t even enough, and we started to shard across host names. This is literally taken from Flickr a month ago: c1, c2, c3, c4 dot staticflickr.com, all probably resolving back to the same IP address, maybe even the same server. This is called “domain sharding”: tricking the browser into thinking these images live on different hosts so it will open up another six connections per host. Flickr has probably got 24 open TCP connections right now, and this comes at a cost. By creating those additional TCP connections, we can completely flood the underlying network. Etsy — I should have had a link to this — have done some amazing research into the threshold at which domain sharding starts to cause performance problems rather than gains. [00:12:38] Finally, we realised: let’s not create a TCP connection at all, let’s inline the resource into our document and send it down with the initial response. It’s quite ironic, because I sat in this exact place about two years ago talking about what a great idea this was, evangelising it, and now I realise it’s probably a very bad idea. When you base64-encode an image, for instance, you increase its size by about 33%, so you’re actually forcing the user to download more; and again, low-powered devices are going to have a hard time, consuming a lot of memory to convert it back to its native format.
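You can check that 33% figure for yourself. A quick sketch, with an arbitrary blob of bytes standing in for an image file:

```python
import base64

# Pretend this is the raw binary of an image on disk (~100KB).
image_bytes = bytes(range(256)) * 400

# Inlining it into HTML or CSS means base64-encoding it first.
encoded = base64.b64encode(image_bytes)

print(len(image_bytes))                  # 102400 bytes on disk
print(len(encoded))                      # 136536 bytes once inlined
print(len(encoded) / len(image_bytes))   # ~1.33 -- the 33% penalty
```

And the client still has to decode it all back to the native format in memory before it can paint anything.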
Hopefully you’ve now learnt why latency matters — it’s all about latency — where latency occurs, what head-of-line blocking is and why it’s the most fundamental problem of H1, and the hacks and anti-patterns we created because of it. Now that we know that, let’s have a look at H2 to see how it overcomes these problems, and which specific bits of the design of the specification make it so brilliant. [00:13:49] To start with — I thought this was quite good — this is taken from a great book: a simple, online, free, open-source book called HTTP2 Explained by Daniel Stenberg. He’s the maintainer of cURL and works on the networking team at Mozilla, and he explains the manifesto, essentially, that the httpbis working group had: the remits and requirements when they were writing the specification. Those were, obviously, to be less sensitive to latency, to fix the pipelining and head-of-line blocking problems we just saw, and to eliminate the need to increase the number of connections to each host with all those anti-patterns. This one is really important: to keep all existing interfaces, URI formats and schemes. I can’t stress this enough. When you’re designing a new version of the spec, it would have been tempting — you know those dots inside URLs, like www.ft.com? I don’t like them, I want to turn them into tildes, www~ft~com — but we would have broken the internet. We couldn’t change the scheme, the semantics, the methods we use, the way most RESTful APIs talk to each other. [00:15:02] Most importantly, it had to be made within the IETF httpbis working group. Google laid some amazing foundations with the SPDY research, but it was theirs, done in a controlled environment. Like most good web specifications, we wanted it to be developed in the open, where other people could add to it and ultimately make it better. To do this, they outlined what I think are the six core features that make up H2: multiplexing, a binary data framing format, prioritisation, header compression, flow control and server push. This is Front-End London, and for front-end developers I think the most important ones to understand are multiplexing, prioritisation, and server resource pushing. At the heart of H2 are streams. Streams are essentially virtual channels inside the TCP connection which carry bi-directional messages. The streams are virtual — they don’t really exist; they contain messages, which are complete sequences of frames, and messages are the things that most closely map to an HTTP/1 request or response. Then you have frames, the building blocks of it all. Frames are the data payloads, and they’re binary. So: frames make up messages, messages are transferred within streams, and a connection can carry multiple streams. [00:16:28] As I said, the building block is the frame. Each frame has a type, such as “I am a headers frame”, “I am a data frame”, “I am a push frame”. All frames share a common nine-byte header field, regardless of frame type, and that header is really important because it declares the frame’s type, its length in bytes (remember, this is a binary format), the stream it belongs to — a stream identifier — and maybe its priority, which lives in the flags. The data within the frame is represented as binary. This is really important, because it allows a frame to declare its length — “I am 20 bytes long” — and that is the reason we no longer have to open more TCP connections: it means frames can be interleaved on the same connection. If the client, say the browser, is consuming that connection and doesn’t care about stream four, it can look at the length and just skip forward that many bytes in the buffer. It can interleave frames, put them in their own buffers, and re-stitch them back together on the other side. This is the most fundamental thing about H2: it’s what allows it to overcome the latency problems, by interleaving binary data frames on the wire. [00:17:54] The only downside to it being binary is that we can no longer inspect HTTP requests or responses on the wire. You’re used to going into your dev tools and seeing the text of the raw response. Unfortunately, that’s going away, but there are some ways around it that I’ll explain later.
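As a sketch of that nine-byte header: the HTTP/2 framing layer lays it out as a 24-bit length, an 8-bit type, 8 bits of flags, then a reserved bit and a 31-bit stream identifier. Here’s a minimal encoder and decoder for just that fixed header (the frame payload itself is omitted):

```python
import struct

FRAME_HEADER_LEN = 9  # every HTTP/2 frame starts with these nine bytes

def pack_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Build the 9-byte header: 24-bit length, 8-bit type, 8-bit flags,
    1 reserved bit + 31-bit stream identifier."""
    return struct.pack(">BHBBI", (length >> 16) & 0xFF, length & 0xFFFF,
                       frame_type, flags, stream_id & 0x7FFFFFFF)

def unpack_frame_header(header: bytes):
    hi, lo, frame_type, flags, stream_id = struct.unpack(">BHBBI", header)
    return ((hi << 16) | lo, frame_type, flags, stream_id & 0x7FFFFFFF)

# A DATA frame (type 0x0) carrying 20 bytes on stream 4:
header = pack_frame_header(20, 0x0, 0x0, 4)
print(unpack_frame_header(header))  # (20, 0, 0, 4)
```

That self-declared length is exactly what lets a receiver skip forward over frames for streams it isn’t interested in.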
Here we can see how closely an H1 request maps to an H2 one. We’ll have the header area in HTTP/1, which will probably map to a couple of headers frames, and then the payload, which will probably map to a couple of data frames. All communication is now performed on a single TCP connection, and this is why it’s such an improvement for latency. Here we can see the frames interleaved with each other, different streams on a single connection. Whilst this is going on, I can send headers frames — requests — for stream six and stream five whilst I’m still getting data for streams two, three and four. This is true bi-directional multiplexing happening right here. This animation is also extremely long. Sorry. Let’s just wait for it to go. Cool. [00:19:06] Over the years, because of the inefficiencies of H1, browsers have had to create their own optimisations for the requests they send whilst rendering a page. They do this by critical resource prioritisation; most browsers have this baked in. As your browser parses the HTML document, it finds resources, and when it finds them it pushes them into a queue, essentially, before it actually sends them off on the network. Most browsers are clever enough that even if they found image one before they found main.css in the source order, they’ll prioritise the CSS file higher, because they know they need it to be able to paint to the screen. But what they’re actually doing here is creating artificial latency: they’re purposefully delaying requests they’ve already found. I keep going on about how much latency matters, and here they are creating it on purpose. Now, with H2, because the order we send them in on a single connection doesn’t matter, I can just fire off the requests as I find them, which is amazing. We’ve already removed the artificial latency that resource prioritisation incurs. The eagle-eyed in the audience will say: okay, but if I still send image one first, doesn’t that mean I’ll get image one back first as well? Obviously it’s faster, but this is where H2’s dependency tree weighting and prioritisation comes in, and it’s actually part of the specification. [00:20:42] As the browser finds those files, it applies a weighting number — weights default to 16 and can go up to 256 — and it can adjust that as it goes. Here you can see that main.css has the highest weight. Inside main.css, once we’ve got it back, we found icon.css. Even while this is still parsing — we’ve only got half of it — we can tell, via the dependency tree, that one is linked to the other: you shouldn’t send me any frames for the child until I’ve got all the frames for the parent. What’s even cleverer is that it’s truly dynamic. Imagine you have a tab open for Google.com, with an H2 connection for it, and the user opens another tab for Google.com and focuses on it. Firstly, the great thing about H2 is that we can share the same connection; but suddenly all the resources for the new tab become much more important than the ones that might still be on the wire for the old one. In real time, we can change the stream weighting via frames and tell the server, “Actually, these ones have become much more important now.” Another use case you could imagine, even on your own website: a user hovers over a button, clicks it, that opens up a carousel, and the carousel contains a big hero image. Suddenly, that hero image’s priority becomes much more important than any other data. We can communicate this to the server via prioritisation. This is extremely, extremely important. [00:22:07] No longer do we have to try and trick the browser and hide resources from the pre-parser, because we can actually get the browser to do this work for us.
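One way to picture those weights: among streams sharing a parent in the dependency tree, available resources are divided in proportion to each stream’s weight. A toy calculation — the resource names and weight values here are made up for illustration:

```python
# Toy illustration: HTTP/2 weights (1-256, default 16) divide bandwidth
# proportionally between sibling streams. Names and weights are invented.
streams = {"main.css": 256, "app.js": 183, "hero.jpg": 32, "footer.png": 16}

total = sum(streams.values())
for name, weight in streams.items():
    share = weight / total
    print(f"{name:12s} weight={weight:3d} -> {share:.0%} of the connection")

# Re-prioritising is just changing a weight mid-flight: after a user click,
# the client can bump hero.jpg and the server shifts bandwidth towards it.
streams["hero.jpg"] = 256
```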
HTTP/1.1 introduced something 0.9 didn’t have: it allowed us to start adding metadata to our requests and responses via headers, and we added a wealth of it. Because HTTP is stateless — there is no state maintained — we unfortunately have to send this data, especially things like cookies, on every single request, even if the server has already seen that cookie. This is exactly how login systems and authentication work on most websites: I always have to send that cookie. To address this, the H2 specification invented HPACK, a compression algorithm designed specifically for HTTP. There could be a whole talk just about HPACK because it’s really, really intelligent, but I’ll explain the basics. The first thing is that the client and the server each maintain a static lookup table — just like a database table on either side of the connection. The specification details the most commonly occurring key-value pairings for headers and assigns each an ID number. Whenever you need to send one of those values, you don’t need to send the key or the value at all; you just send the ID, because the server knows exactly what it means. The really intelligent bit is that, for the duration of that single H2 connection, both the client and the server also maintain a dynamic lookup table. The first time I send this cookie header, the server will assign it a key, say 64; the next time I want to send the same thing, I only need to send the key 64, because the server has already seen it. [00:24:02] Now, the catch here is that this is actually maintaining state between a client and a server, and that’s why things like CDNs have had quite a lot of trouble implementing H2. To send a new header, I just send it and it gets assigned the next ID. All of this data is then Huffman-encoded — the same coding that gzip uses — to reduce the footprint even more. It’s incredibly, incredibly intelligent. If you’re into that kind of thing, go and check out the HPACK spec.
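As a toy sketch of that indexing idea — a deliberate simplification, not the real HPACK wire format, which adds a fixed static table, size-bounded eviction and Huffman coding on top:

```python
# A toy sketch of HPACK-style indexing -- NOT the real wire format.
# Both ends keep identical tables, so an index can stand in for a header.

STATIC_TABLE = {1: (":method", "GET"), 2: (":path", "/"), 3: (":scheme", "https")}

class ToyHpack:
    def __init__(self):
        # The dynamic table grows identically on both ends of the connection.
        self.dynamic = {}          # index -> (name, value)
        self.seen = {}             # (name, value) -> index
        self.next_index = len(STATIC_TABLE) + 1

    def encode(self, name, value):
        """Return a small index if this header was sent before; otherwise
        send it literally once and remember it for next time."""
        key = (name, value)
        if key in self.seen:
            return ("indexed", self.seen[key])   # just a few bits on the wire
        self.seen[key] = self.next_index
        self.dynamic[self.next_index] = key
        self.next_index += 1
        return ("literal", name, value)          # full bytes, but only once

enc = ToyHpack()
cookie = ("cookie", "session=abc123")            # a made-up session cookie
print(enc.encode(*cookie))  # ('literal', 'cookie', 'session=abc123')
print(enc.encode(*cookie))  # ('indexed', 4) -- later requests send the index
```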
Finally, before moving on, the last H2 feature I want to discuss is server-side resource pushing. As we saw earlier, I have to send my GET request for the index file, wait for the server to process it, get it back, start parsing the document, find the CSS file and only then send a request for that. That’s extremely inefficient, because I’ve had to wait a full server round trip even though, as application designers and developers, we know that the next thing the browser will request — because of resource prioritisation — is main.css. I always know that when index.html is requested, the next thing they’ll ask for is that. What if we could intelligently tell the client? With H2, whilst the server is processing that dynamic file, it can push the data for main.css even before the client has got any of the index file, and before it would normally parse the HTML and find it. We’ve dramatically reduced the latency of that normal round trip. [00:25:35] Now, the eagle-eyed in the audience will notice the push promise frame — a new frame type, just like the headers and data frames. The specification says the push promise frame must be sent before any data frame of the initial resource. Why? Because you might get into a race condition: if I sent the data frames first, the client might be so fast that it finds the CSS file itself and requests it, and you end up sending more bytes than necessary. So, as part of the spec, the push promise must be sent before any data frames of the resource that references it. Now, the even keener among you will have noticed: what if the client already has that main.css file in its cache? We have another new frame type called “reset stream”: the client saying, “Okay, thanks, but I don’t need any of the data frames for that stream, I already have it in my cache.” That’s even more efficient. We can use reset stream quite a lot: if, say, a cached copy isn’t stale, you could also reset, maybe, Ajax requests during the connection’s lifetime. This is incredibly powerful, I can’t stress it enough. This is truly bi-directional multiplexing of H2 connections in full effect. [00:26:53] We’ve learnt the building blocks of H2: streams, frames and binary data framing, resource prioritisation, header compression via HPACK, and server-side resource pushing. Hopefully that wasn’t too much information to take in — I promise that was the most techy, crazy bit of this talk. Personally, the more I’ve worked with H2 and read the spec, the more I’ve come to be amazed by its design. It is truly incredible compared to H1. They’ve done some amazing work here, and I’ve literally only skimmed the surface. It’s a very large specification, but I do urge you to go and read it a bit more; hopefully I’ve given you enough to take home and apply to your day-to-day work. The most positive thing I’m going to show you today is the current browser support landscape. We’ve got a global average of 70% support, and here in the UK we’ve actually got 77%, which is amazing. Safari finally jumped on the bandwagon with version 9, and that brought the stats up massively. This alone was enough for us at the FT to know it was worth investing in. [00:28:04] Out of interest, who here has deployed H2 into production? One person. Great. Two. Awesome. That’s good to see, and that’s hopefully why I’m here: to try and get you all on the bandwagon. A much-debated feature of H2 is its requirement for TLS — having to serve your website over HTTPS — and that might well be why most of you aren’t using it yet. In the original SPDY specification it actually was a requirement, but the httpbis working group dropped it, so it no longer is; the browser vendors, however, will only speak H2 over TLS. They got a lot of stick for that, and I actually agree with them: one, because I think we should be making the world a more secure place, but two, because in the SPDY experiments they found that many of the old middleboxes and proxies — the internet is actually made up of lots of cables and boxes that get eaten by sharks under the sea — would drop packets they didn’t understand while routing them. In the SPDY experiments, they saw between ten and fifteen percent packet loss across connections.
That’s why they enforced TLS: by creating a secure tunnel, you’re ensuring that no middleboxes can inspect the packets. [00:29:21] Who here has heard of Let’s Encrypt? Awesome. That’s great. There’s no reason for your site not to be delivered over HTTPS now that there’s a free, open certificate authority that makes issuing and reassigning certs so easy. Lilia, it would be really good to get someone to come and talk about Let’s Encrypt, because I think more people need to know about it. Once you’ve got TLS set up, you can start observing your own traffic. This is what our stats looked like when we started with H2 — it’s obviously gone up a lot more since. We started to talk to our stakeholders, explaining the benefits of HTTPS and H2 and what might happen to those other users. The server support is looking really good. Most of you probably serve your websites using NGINX or Apache; they both support it now. This slide is actually about to change; I need to update it. Jetty if you’re using Java, IIS for Windows — the support is all there for the actual servers. Sadly, the CDN landscape is not looking as good, but it’s getting better: Cloudflare support it, there’s no sign from AWS CloudFront yet, and Fastly have done it. They’ve actually got the best implementation at the moment, and I’m not being paid to say that even though I am a customer; it’s just that they’ve chosen the best H2 server, something called H2O. Akamai are just about to release push. [00:30:40] I hear we all like to deploy our software in the cloud these days. I don’t know why; it rains a lot up there. That was a terrible joke. The service providers’ internal networking stacks are in a bad way: Google App Engine are the only ones with native support. AWS haven’t mentioned anything about their ELBs yet, and because Heroku is on AWS, I don’t think we’ll see anything there any time soon. So that’s great — you’ve upgraded your server software; for some of you it’s as simple as that. How else can you start using this? The approach we first took at the FT: you probably already serve your static assets — your images or your CSS — off something other than your application server, so why not put a CDN or a proxy that speaks H2 in front of those? Then you don’t need to worry about your origin, which will probably take a lot longer to upgrade. Or stick a proxy that can speak H2 in front of your origin. I know a lot of businesses are going for this approach; it’s in fact how Fastly did it as part of their CDN work. Finally, hopefully in a couple of years, all of our servers will speak it natively. [00:31:48] Once you’re H2-capable, it’s time to start considering how you’re going to optimise your resources and what you want to push. Here’s your typical critical rendering path: we request the HTML file, we parse it, we get the CSS, we block and wait for that, then we get our fonts and block and wait for those. Obviously, for me, the things you should be pushing are your critical resources — and look at the impact that already has on the timeline, the amount of latency we can remove. There’s been a lot of debate within the W3C about how we, as developers, declare the resources we want to push. The most common way of doing it now is via the Link header, using the rel=preload attribute. This is me saying, “style.css: I want you to push this”, and Apache, NGINX, and H2O have all adopted this as their way of triggering resource pushes.
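As a minimal sketch of that, here’s a tiny WSGI app (Python standard library only) emitting such a Link header; the assumption is that an H2-capable server or proxy sitting in front of it interprets the header as a push hint, as the servers above do. The paths and port are made up for the example:

```python
from wsgiref.simple_server import make_server

def app(environ, start_response):
    # The Link header below is the push hint: an H2-capable server or proxy
    # in front of this app can push /static/style.css when it sees it.
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Link", "</static/style.css>; rel=preload; as=style"),
    ])
    return [b"<html><head>"
            b"<link rel='stylesheet' href='/static/style.css'>"
            b"</head><body>Hello, H2!</body></html>"]

if __name__ == "__main__":
    # Plain HTTP/1.1 locally; TLS and H2 would terminate at the proxy/CDN.
    make_server("127.0.0.1", 8000, app).serve_forever()
```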
You can also do it via a link element in the HTML, but that’s probably far too late — the client would have already found the resource — so you should do it as a header. Myself and people like Yoav Weiss, who wrote the preload spec, think we actually need our own semantics, because the semantics of preload are slightly different from those of push; we’re pushing for a rel-push specification. [00:33:06] Now you’ve got it working, you want to see whether it’s actually doing what you want. WebPageTest, the toolbox of most performance engineers, now has native support: you can see multiplexed connections within the connection view, and with the Firefox agents on WebPageTest you can see pushed resources. It took me about three months to realise that in Chrome you have to right-click in the network panel and enable the Protocol column. It literally took me three months, and that’s the only way in Chrome to tell whether or not the connection is over H2. Here you can see us serving our document over H1 and all the resources over H2. Firefox put it where I would have expected: in the headers view for the network connection. I mentioned earlier that, because of the binary framing, we can no longer inspect the actual response body on the wire, so unfortunately we front-end developers are going to have to become better friends with tools like Wireshark. A lot of people find that quite scary when I say it; I was scared too. There’s a really good blog post explaining how to set up Wireshark properly and decrypt the TLS. This is great: here you can see headers frames on the wire and the actual data inside them. It’s decrypted the TLS and decoded the binary, and now I can actually see the data. [00:34:26] The only other way to do that in dev tools at the moment is in Chrome, via chrome://net-internals: you can inspect the open H2 session, and here are the headers. It needs a lot of work, especially as more people are going to have to start using net-internals; it needs so much love. If you’ve implemented H2 and you want to know how much of the specification your server or CDN lives up to, there’s a really good test suite on GitHub called h2spec that you can run against your deployment to see how much of the spec you’ve implemented so far. Sorry for the whirlwind tour at the end there — I’ve already run over time. I told Lilia I’d only be 30 minutes and I’m already at 35. Hopefully, in that section you’ve learned about browser support, the TLS requirement, the considerations when choosing an H2 server, what you should be pushing and how, and the current tooling landscape. [00:35:23] To end with: whilst HTTP/2 is here, and a lot of you aren’t using it yet, there are already some problems with it that people are starting to iron out, and it’s only by us trying it and feeding back to the working group and the browser vendors that it will get better. One of the biggest problems you might have noticed with push is that it can still be extremely inefficient: I’m sending resources down to the browser even though they might already be in the browser’s cache. Kazuho Oku, who wrote the H2O server, has come up with a cache digest specification. This will be a new frame type, and it’s a way for the client to tell the server which resources it already has in its cache for that host name. That’s extremely powerful — and not just for H2; think about how CDNs could use this. It’s amazing.
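As a very rough sketch of the idea — heavily simplified, since the actual draft Golomb-codes truncated hashes of the cached URLs to make the digest far smaller than a plain set:

```python
import hashlib

# Toy version of a cache digest: hash each cached URL down to a few bits
# and send the resulting set to the server. The real draft compresses the
# hashes with Golomb coding; this sketch only shows the membership idea.
def digest(urls, bits=20):
    out = set()
    for url in urls:
        h = int.from_bytes(hashlib.sha256(url.encode()).digest()[:4], "big")
        out.add(h >> (32 - bits))  # keep only the top `bits` bits
    return out

cached = digest(["https://example.com/main.css", "https://example.com/app.js"])

# Server side: before pushing, check whether the client probably has it.
candidate = digest(["https://example.com/main.css"])
if candidate <= cached:
    print("client likely has main.css cached -- skip the push")
```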
It basically uses an algorithm to compress all those values down into a single digest, and sends that up in a single frame on the H2 connection. It’s very powerful. I talked a lot about head-of-line blocking earlier, and whilst we’ve eliminated it, we’ve only done so at the request layer; really, we’ve pushed the inefficiency down to the TCP layer. A lot of deployments — friends of mine at Fastly, for instance — actually don’t like H2, because all it’s done is push the problem lower down the stack. Google realised this years ago and have already moved on: they’ve now opened up the specification for QUIC, which is essentially H2 on top of a UDP connection rather than a TCP one. [00:37:05] When you’re browsing Google.com and many of the Google properties today, you’re already not using SPDY or H2; you’ll be on a QUIC connection. It took four years, as the timeline at the beginning showed, for H2 to become a standard, so I’ll probably be back in four or five years to talk about QUIC and why it’s so great. Ultimately, this is all new, and there are a lot of unanswered questions: when should we start de-optimising our assets? What’s best to push? How does this play with servers? How does it affect my website? To leave on that note, I just wanted to share some of the findings we’ve had at the FT, because I think it’s only by people talking, sharing and writing blog posts that the web evolves. We ran a split test when we first deployed HTTP/2, on mobile, tablet and desktop, and compared the 95th percentile of the page-load event. You’ll see that it doesn’t actually have much of an impact on desktop, but it has a dramatic impact on mobile: nearly five seconds. I think this is amazing, because it proves it truly was designed to battle latency, and that is the problem we have on mobile; we don’t really have latency problems on desktop. Then we measured round-trip time — the time these average connections take — and you can see that the greater the round-trip time, the bigger the benefit H2 had on the overall load event, while H1 just got slower and slower and slower. Finally — and this is where desktop did benefit — we started some experiments with push, and you can see here that we shaved a thousand milliseconds off our start render by pushing CSS files. I’ve been working in performance for quite a long time now, and I’ve never had a single technique that could cut that amount of time off. It’s incredible; go and use it, and come and chat to me afterwards. I’m so sorry I’ve gone over. Thank you very much. There’s more! The performance basics still matter, right? A slow website on H1 is still going to be a slow website on H2. You still need to think about the basics: optimise your first render; compress, minify, optimise; reduce DNS lookups; use CDNs to reduce your latency. And come to London Web Performance — we host it at the FT and I’m a host, sorry, I had to plug that. If you want to find out more about performance techniques, come to that. Thank you very much. That’s the end.