Last week we added a seemingly simple feature to Ready Room: the ability to archive a single request. This feature lets a user download a zip file containing all the files attached to a request, and it would indeed have been simple to implement if we could guarantee that a request would never have more than a small number of small files. Alas, we have no such guarantee. There could be hundreds of files, each hundreds of megabytes in size. Moreover, a hundred users could decide to generate an archive simultaneously.
Servers have a finite amount of RAM. If we tried to zip up a hundred 100MB files in memory, we would quickly exhaust all available memory, crash the application process (which would restart automatically, because that’s how we roll), and never deliver the user’s download. This would likely cause the user to click the “Export” button again, which, of course, would have the same result: chew up a bunch of RAM and then crash. These are the kinds of thoughts that keep programmers up at night (and gainfully employed).
So these are our actual requirements:

- Archive hundreds of files per request, each potentially hundreds of megabytes in size.
- Support many users generating archives simultaneously.
- Do all of this without exhausting the server’s RAM or making the user wait for the download to begin.
Suddenly, things are not so simple.
But you’ve seen this problem before. It’s the same problem Netflix has when you start watching The Crown. If Netflix had to send you an entire episode before you could start watching, three things would happen: One, you would need to wait an interminable amount of time before your program starts; two, your smart TV/Roku/laptop would run out of memory and crash; and three, you would cancel your Netflix subscription.
Instead, what Netflix does is stream its programming to you. That is, it takes the first few bytes of a program, puts them on the wire, and then ejects them from RAM. The Netflix client (e.g., your TV) receives these bytes, displays a frame or three of the episode, and then it too throws the bytes away. Meanwhile, the server grabs another small chunk of The Crown and sends it to the client, which displays another few frames, and so on. This continues until Charles divorces Diana.
Our archive feature is going to require a very similar solution.
As you may recall, Ready Room uses Google Cloud Storage to securely store files. These files are accessible via HTTP. What you may not know is that HTTP 1.1 supports chunked transfer encoding, meaning a web server can deliver a response body in small pieces, and a client can consume each piece as it arrives instead of buffering the entire file. Furthermore, the Elixir library that Ready Room uses for HTTP access, HTTPoison, also supports chunked transfers.
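As a rough sketch of what this looks like, HTTPoison’s async mode can be wrapped in a lazy Elixir `Stream` so that chunks are pulled from Google Cloud Storage one at a time. The module and function names here (`FileStreamer.stream_file/1`) are illustrative, not Ready Room’s actual code, and the `async: :once` pattern is one common way to get one-message-at-a-time delivery:

```elixir
defmodule FileStreamer do
  # Lazily stream the body of a file over HTTP, one chunk at a time.
  # Hypothetical helper for illustration; not Ready Room's actual code.
  def stream_file(url) do
    Stream.resource(
      # Start the request. With `async: :once`, HTTPoison delivers exactly
      # one message, then waits until we ask for the next with stream_next/1.
      fn -> HTTPoison.get!(url, [], stream_to: self(), async: :once) end,
      fn %HTTPoison.AsyncResponse{id: id} = resp ->
        receive do
          %HTTPoison.AsyncStatus{id: ^id} ->
            HTTPoison.stream_next(resp)
            {[], resp}

          %HTTPoison.AsyncHeaders{id: ^id} ->
            HTTPoison.stream_next(resp)
            {[], resp}

          %HTTPoison.AsyncChunk{id: ^id, chunk: chunk} ->
            # Emit one chunk downstream, then request the next.
            HTTPoison.stream_next(resp)
            {[chunk], resp}

          %HTTPoison.AsyncEnd{id: ^id} ->
            {:halt, resp}
        end
      end,
      # Clean up the underlying hackney request when the stream is done.
      fn %HTTPoison.AsyncResponse{id: id} -> :hackney.stop_async(id) end
    )
  end
end
```

Because `Stream.resource/3` is lazy, no bytes are fetched until something downstream asks for them, and each chunk can be discarded as soon as it has been passed along.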
Now we need a way to take each chunk of the file and begin generating a zip archive, and we need to start sending that archive to the user even as we’re still building it. Fortunately, there’s a library available that allows us to create an archive from a stream of data. And of course the Phoenix Framework, which embeds a web server and on which Ready Room is built, has a way to chunk responses back to the user.
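Tying the pieces together might look something like the controller action below. This is a sketch, not Ready Room’s actual code: it assumes the Zstream library for stream-based zip creation, a hypothetical `list_attached_files/1` helper that returns each file’s name and URL, and a hypothetical `FileStreamer.stream_file/1` that lazily streams a file’s bytes over HTTP:

```elixir
def export(conn, %{"id" => request_id}) do
  # Hypothetical helper returning structs with :name and :url fields.
  files = list_attached_files(request_id)

  # Build a lazy stream of zip data: each entry's bytes are themselves
  # a lazy stream, so nothing is read until the archive is consumed.
  zip_stream =
    files
    |> Enum.map(fn file ->
      Zstream.entry(file.name, FileStreamer.stream_file(file.url))
    end)
    |> Zstream.zip()

  conn =
    conn
    |> put_resp_content_type("application/zip")
    |> put_resp_header("content-disposition", ~s(attachment; filename="archive.zip"))
    |> send_chunked(200)

  # Pull one chunk of zip data at a time and push it to the client.
  # If the client disconnects, stop pulling from upstream immediately.
  Enum.reduce_while(zip_stream, conn, fn data, conn ->
    case chunk(conn, data) do
      {:ok, conn} -> {:cont, conn}
      {:error, :closed} -> {:halt, conn}
    end
  end)
end
```

The key design point is that every stage is pull-based: Phoenix only asks the zip stream for data when the previous chunk has been written to the socket, so memory usage stays flat no matter how large the archive is.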
If you were to play all that backwards, it goes like this:

1. The user’s browser asks Phoenix for the next chunk of the download.
2. Phoenix asks the zip stream for the next chunk of archive data.
3. The zip stream asks HTTPoison for the next chunk of the current file.
4. HTTPoison asks Google Cloud Storage for that chunk.
This mechanism of having each component in the stream request a chunk of data from the component behind it has the added benefit of handling back pressure automatically. That is, at no point in the process do we need to buffer (and potentially lose!) data because it is arriving too fast. When using streams, data arrives at each step only when requested.
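You can see this demand-driven behavior in miniature with any lazy Elixir stream. In the toy example below (illustrative only), the producer function runs only when a consumer actually asks for an element, so nothing is ever computed, buffered, or lost ahead of demand:

```elixir
# A lazy, infinite stream of chunks. The IO.puts side effect lets us
# observe that production happens only on demand.
chunks =
  Stream.unfold(0, fn n ->
    IO.puts("producing chunk #{n}")  # runs only when a chunk is demanded
    {"chunk-#{n}", n + 1}
  end)

# Taking two elements causes exactly two chunks to be produced.
chunks |> Stream.take(2) |> Enum.to_list()
# => ["chunk-0", "chunk-1"]
```

The same principle scales up to our archive pipeline: the “producer” is Google Cloud Storage, and demand originates from the user’s browser draining the response socket.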
And here’s the best part. Not only does the download start immediately, it uses next to no RAM. To the right is a picture of memory consumption before, during, and after zipping up a gigabyte of files (we’re interested in the blue line). The request was made at 18:03 and completed two minutes later. During that time, memory consumption “jumped” from an already ridiculously low 198.2MB to 202.8MB, an increase of just 4.6MB! Which, if you’re not familiar with the art, is a value approaching zero.
We have addressed our requirements.