Image Processing and Manipulation in Node.js

Eyal Arubas speaking at JS Monthly London in June, 2016

About this talk

In this talk, Eyal Arubas discusses how his dissatisfaction with the image processing capabilities of Node.js drove him to develop his own solution to a common problem. He shows us how he went about building such a library, the challenges he faced and why the alternatives just don’t quite cut it.


Eyal, it’s fine. I’m going to talk today about image processing in Node.js. I know it’s not a very natural subject for JavaScript. Just raise your hand, anybody here who has done image processing in JavaScript before. Cool. What did you use to do that? Watson. Watson. I don’t know what Watson is, but all right. Yes, what did you use? Clarify. Okay, yes. Did anybody here use an npm module to do image processing? Yes? ImageMagick. [00:00:38] ImageMagick. Very good. All right. ImageMagick, cool, cool. I’m basically going to present my module here. It’s not just about self-advertisement; I’m also going to show you why I developed this module and why ImageMagick was not good enough for me.

About two years ago, I switched from PC to Mac. I bought my first Mac, which is this one here, and I was looking for a nice image viewer and I didn’t really want to pay for anything. It’s not the open source thing, to pay for stuff. I couldn’t find any good Mac program to do imaging. I thought it would be a nice opportunity to develop one myself. Back then, there was something called “Node WebKit”, does anybody know what it is? All right. It’s basically a way to build desktop applications in JavaScript, and there is a similar project today called “Electron”. If anybody has worked with the GitHub IDE, Atom, it’s built on the same technology, basically. It’s a way to build desktop applications with JavaScript and that’s what I started to do.

I thought I’d build an image viewer; how hard can it be? You just need to read an image from the file system and display it to the user. That’s what I did. It worked out fine for viewing images. Basically, you just need to read the file and then render it on some canvas element. Right? The next step would be to actually edit that image. For example, resize or crop the image or whatever. That’s where I got a little bit stuck. I used the developer’s best friend, Google.
I asked Google how to do image resizing in Node.js and the first thing Google gave me was “gm”. The guys here who used ImageMagick before probably know what it is. It’s basically an npm module to do image processing, so let’s do a small example. [00:02:46] Can you hear me without the microphone, by the way? Yes? Cool. I don’t have internet here but I already installed gm before. I’m going to show you what happens when I try to use it. I fired up Node. Can everybody see the screen properly? Do you need it bigger? No? It seems to be fine, right? Let’s require gm. There we go. We have an npm module here. The way it works is you give it an image, and I have an image in the same folder I ran this program from. Then you specify some operation. For example, resize the image. Then you write the image to disk. Let’s call it “output.jpeg”, and you give a callback function to know if there was some error. Let’s print the error if it’s there. Okay, so it could not execute GraphicsMagick, ImageMagick, whatever.

Google gave me a start but then I realised I actually have to read the readme of this thing. I go to the page that we saw and that’s the readme. The first thing on the readme is “getting started” and they say I need to install ImageMagick. Remember, I was new to Mac back then and they said I need to use something called “brew” or “Homebrew”. I had no idea what it is, right? It was basically my first day using a Mac. I said, “No, no, no. That’s too complicated. Let’s go back to Google.”

[00:04:30] On the third result of Google, Stack Overflow to the rescue. How to do image resizing without ImageMagick. Perfect. That’s exactly what I need. I went there and the question was empty; there was no answer, basically. A guy asked a question and nobody answered it. The answer that you see here is actually mine. That’s my name, and I answered about one month after I saw the question, and that was the first version of my module.
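For reference, the `gm` call typed in the demo has roughly the shape below. This is a minimal sketch, not the exact code from the talk: the helper name `resizeWithGm`, the file names and the 240 by 240 size are assumptions, and the `gm` module is passed in as a parameter so the snippet itself loads even where `gm` is not installed.

```javascript
// Hypothetical helper: resize an image with the `gm` npm module.
// `gm` is injected by the caller; nothing here requires it directly.
function resizeWithGm(gm, inputPath, outputPath, done) {
  gm(inputPath)
    .resize(240, 240) // assumed target size, not taken from the talk
    .write(outputPath, function (err) {
      // `err` is set when the external GraphicsMagick/ImageMagick
      // binary cannot be executed, which is the failure seen in the demo.
      done(err || null);
    });
}

// Usage with the real module (needs the external binaries installed):
// resizeWithGm(require('gm'), 'image.jpg', 'output.jpeg', console.log);
```

The point of the demo is the error path: the JavaScript side is only a thin wrapper, and without the external program installed the callback receives an error.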
I thought it’s impossible that nobody has done anything like this before. Just for the background, my problem with ImageMagick was that you have to install external dependencies on your system before you can actually use the module. It’s not enough to do npm install gm, you actually have to make sure things are installed on your system before you start using it. If you use a Mac, that’s great, you have instructions on the readme, just do brew install imagemagick and you’re done. What if you use Windows or Linux? God forbid. That’s why I thought there must be a better module and an easier way to work. I went where all developers go when they’re seriously stuck, and that’s the Node.js mailing list. We started a discussion there about what to do in this situation where I don’t want to have any external dependencies on my system. I just want a pure Node.js module to do image processing. They also told me there is nothing like that right now.

[00:06:06] I saw there is a need. An unanswered question on Stack Overflow, a discussion on the Node.js mailing list. I figured it’s a good opportunity to start working. Before I continue, I want to show you an example of how it works and then we can go a little bit deeper into how I actually built the module. Okay. Now, I don’t have internet working here so I can’t show you the actual installation process from npm, but you have to trust me that all you have to do is npm install lwip, that’s the name of the module, then it runs some npm commands, compiles some things and you’re good to go. You don’t have to install anything else afterwards.

The way it works, basically, is you open an image and in the callback you get an error object and an image object. Let’s put the image object in a global variable. Okay, so we have an image object. Now, let’s say we want to rotate the image. We do image.rotate and we want to rotate it by 30 degrees and have a yellow background while rotating it.
Again, you have a callback with an error and an image. Now we rotated the image and we want to save it to disk, so let’s do image.writeFile and call it “output.jpeg”. We’re done. Let’s check it out. Now, we have output.jpeg here. I’ll open the original image. It looks like this. Does anybody know that image, by the way? Yes. [00:08:15] All right, so that’s the standard example image that everybody uses everywhere in image processing. There is a story behind that image, which maybe I’ll tell you later in the bar, but let’s see the output first. That’s the original image, and the output we expect to be rotated with a yellow background. There we go. Right? It works. We didn’t need to install any external dependencies to do that, we just needed to do npm install of the module and we have an image processor in Node.js.

Now I’m going to show you how it was built and some of the challenges involved. If you have questions, just stop me and I will answer as I talk. You don’t have to wait until the end to ask any questions. Any questions so far? Cool.

An image (I’m not going to go too deeply into the theory of images, all right, just very simple stuff) is basically just a collection of pixels. It’s a huge array. A 10MP image will have roughly 30 million values, because each pixel is represented by three values: red, green and blue. Right. Those are just integers, but we don’t deal with images just as arrays, we use image codecs. We need them to store images efficiently on disk or send them over the network, because if we had to actually send 30MB of data, it would be too much, so we use codecs to compress the images in different ways. Different codecs work in different ways. Codecs have existed for about 30 years, I think. The GIF codec, I say “GIF”, some people will say “GIF”, I’m not sure what you like to say, but the GIF codec has existed, I think, since 1989. A long time now. All those codecs are written in C++, not in JavaScript.
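To make the arithmetic concrete, here is a small sketch of an image as a flat array of red, green and blue values. The row-major layout, the helper names `rawValueCount` and `pixelOffset`, and the 4000 by 2500 dimensions chosen for the 10MP case are assumptions for illustration; the talk does not say how the module lays out its buffer.

```javascript
// A raw RGB image: one byte per channel, three channels per pixel.
const CHANNELS = 3; // red, green, blue

// Number of values needed to hold a width x height image.
function rawValueCount(width, height) {
  return width * height * CHANNELS;
}

// Offset of the red value of pixel (x, y), assuming row-major order.
function pixelOffset(width, x, y) {
  return (y * width + x) * CHANNELS;
}

// A tiny 4x2 image so the example stays cheap to run.
const width = 4;
const height = 2;
const pixels = new Uint8Array(rawValueCount(width, height));

// Paint pixel (1, 1) yellow: full red, full green, no blue.
const offset = pixelOffset(width, 1, 1);
pixels[offset] = 255;
pixels[offset + 1] = 255;
pixels[offset + 2] = 0;

// A 10-megapixel image (assumed 4000 x 2500) needs 30 million values.
console.log(rawValueCount(4000, 2500)); // 30000000
```

At one byte per value that is about 30MB uncompressed, which is the figure that motivates using a codec.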
That’s probably how an image looks in memory, if we could actually see it. Just an array of pixels, and codecs take this image and do something with it. Some codecs are lossless, like PNG for example. They just take the pixels and compress them, just like Zip does, and PNG and Zip actually work with similar technologies. GIF is also lossless, but JPEG is a lossy codec, which means it loses some data in favour of saving space.

[00:11:01] What’s the point of all this? I want to show you what it takes to develop an image processor, or at least software that can handle images. We talked about codecs and we talked about arrays of pixels in memory, and now we need to tie it all together. I haven’t even told you yet how it all relates to JavaScript, right? The general process is this: you take an encoded JPEG image, you decode it using the JPEG codec, and then you have a big array of pixels in memory. Then you want to do some transformations on this array. For example, you want to rotate this array or resize it or blur it or whatever. All of those are mathematical transformations, and then you get a new array of pixels. If you want to save that to disk, you need to use the codec again to encode this array of pixels into an image.

The most popular codecs right now are JPEG, PNG, and GIF. Those are screenshots from the webpages of those respective codecs and, as you can probably guess, codec developers are not really good web developers; otherwise, those pages would look much better than they do. They are very good at writing C++ code and they have been doing this for a long time. We said about 30 years now. My point is that those codecs are well tested, they’re used everywhere in the world. Almost every project that needs to use GIF images uses the official GIF codec, and they are all written in C++. That’s a little bit of a problem, because when writing an image processor in JavaScript you now need to interface with C++ code.
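The decode, transform, encode cycle can be sketched with Node's built-in `zlib` module. This is an illustration under a loud assumption: DEFLATE is used here as a stand-in for a lossless codec (it is the compression family that both Zip and PNG build on), not as a real image format, and the function names `encode`, `decode` and `invert` are hypothetical.

```javascript
const zlib = require('zlib');

// Stand-in "codec": plain DEFLATE via zlib, NOT a real image format.
function encode(pixels) {
  return zlib.deflateSync(Buffer.from(pixels));
}

function decode(encoded) {
  return new Uint8Array(zlib.inflateSync(encoded));
}

// Example transformation on the pixel array: invert every colour value.
function invert(pixels) {
  return pixels.map(function (value) { return 255 - value; });
}

// A flat grey 100x100 RGB image: 30,000 identical values.
const original = new Uint8Array(100 * 100 * 3).fill(128);

const onDisk = encode(original);   // compact form for storage or network
const inMemory = decode(onDisk);   // back to a plain array of pixels
const edited = invert(inMemory);   // mathematical transformation
const savedAgain = encode(edited); // encode the new array

// Lossless: the decoded array matches the original value for value,
// while the encoded form is much smaller than the raw array.
console.log(original.length, onDisk.length);
```

A lossy codec such as JPEG differs at exactly one point: the decoded array would be close to the original but not identical.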
You need to learn how to do that. That’s not really trivial. [00:12:54] I’m sure you can, for example, say, “I’m going to write the entire JPEG codec in JavaScript from scratch”, but that might not be a very good idea. I know some people are trying to do that, but it’s really hard. First of all, because JPEG is a very complex codec, and because the fact that the C++ versions of JPEG have existed for so many years means that they are already tested and people already use them everywhere. Why not reuse the same thing?

The next step of developing my module was to understand how to actually write native code to interface with Node.js. That’s the main point of this module. It’s not like the modules that use ImageMagick, which basically tell you, “Install this executable on your system”, and then the JavaScript code just spawns an external process and uses ImageMagick to do the processing. My module actually compiles native code and interfaces with JavaScript by exposing the native code, which is written in C++, as JavaScript methods. First, we need to take all those native codecs, JPEG, GIF, PNG, whatever, in their C++ versions and implement on top of them ways to manipulate images, like resizing the image or cropping the image or whatever we want to do. All of this is done on the C++ side. Then we want to expose those functions to JavaScript.

By raise of hands, has anybody heard about what V8 is? The V8 engine? Cool. I need to just explain really briefly. It’s the JavaScript engine behind Node.js, right? Every line of JavaScript that you write in Node goes to the V8 engine, it interprets it, and runs the logic behind your program. The nice thing about the V8 engine, and other JavaScript engines as well, is that they give you a way to write native code and expose it to the JavaScript developer. [00:15:00] The thing is, since Node.js was first released, there have been so many versions of V8 and so many ways to interface with it that it became really complicated.
Now, if you want to expose native code to V8, you need to make sure it’s backwards compatible with every version of Node.js, including Node 0.10 for example, which has been around for a few years now. Luckily for us, there is a really nice project called “Native Abstractions for Node”. You can look it up, it’s on GitHub, and it’s a bunch of smart people who basically wrote C++ header files, which contain macros and functions and templates and classes that encapsulate the differences between the different versions of V8 and allow us to write native code for JavaScript, for Node.js, in a very convenient way.

Let me show you really quickly how we can develop one of those native modules. Let’s say we want to create a very simple native math module. We want to use it like this: native math, it’s an object which contains math methods. For example, the sum method. We want to run it like this, just give it two numbers and it would return the sum of those numbers. The point is that we want to write C++ code to actually do this and expose this functionality to JavaScript. If we were to write a C++ function to do this (did anybody here write C++ code in the past year, for example? All right, cool), it shouldn’t be really complicated. It’s basically very familiar syntax. Let’s create a new file and call it “nativemath.cpp”. All right. Okay. We want to define a new function called “sum”. Let’s call it “sum native”, and it takes two arguments, A and B, and it simply returns the sum of them. Now, if we were inside JavaScript, ideally, we would want some way to interface with this new function which we just wrote and, remember, this is a C++ function. It’s compiled and run natively later by our specific machine, and we want some way to access this function in JavaScript.
If we were writing JavaScript code, the next thing we would do is define a new JavaScript function, let’s call it “sumJs”, which receives two arguments, A and B, and uses the sum native function that we wrote before as the result. [00:18:08] This is ideally how we would write this glue between C++ and JavaScript. Unfortunately, this would not work. As a Node.js module we would just do module.exports.sum equals sumJs. Let’s get this out for now.

Just to save some time, I already wrote the glue code for this, so we don’t have to spend too much time writing it from scratch. It looks like this. We have the native sum and we have some ways to build JavaScript functions from within the C++ code. This is exactly how we do this. We don’t have to take too much time to read this, but basically what this means is that we create a JavaScript function from the C++ code by interfacing with the V8 engine. Those lines that you see here are basically creating this new function, and it’s equivalent to what you would write as regular JavaScript code. If anybody is interested, I can show you in depth later how it works exactly. Those lines over here are the part where we export the module as a Node.js module.

Again, I don’t want to spend too much time reading it, but just to show you it works, let’s compile this. This now went through a compilation step, and the output of this compilation step is that we have a build folder and a release folder and inside here we have a new file called “nativemath.node”, which is basically just a static library. Sorry, dynamic library. What Windows users would call a “DLL”, this is basically it. This is a file which we can require directly from Node.js. Let’s try that. [00:20:06] Let’s fire up Node and let’s require this file. We have the native math module. We take it from the build release folder and it’s called nativemath.node.
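Seen from the JavaScript side, the compiled module behaves roughly like the stand-in below. This is pure JavaScript written for illustration, not the C++ glue shown in the talk: `sumNative` only imitates a C++ function that takes two integers, using `| 0` to mimic the conversion of each argument to a 32-bit integer, and the object name `nativeMath` is an assumption.

```javascript
// Pure-JavaScript stand-in for the C++ function. In the real module this
// body is compiled C++; `| 0` imitates casting each argument to an integer.
function sumNative(a, b) {
  return (a | 0) + (b | 0);
}

// The wrapper that the JavaScript developer calls.
function sumJs(a, b) {
  return sumNative(a, b);
}

// Shape of the exported object. In a real module file this would be
// `module.exports.sum = sumJs`, and the caller would load the compiled
// build output with require() instead of using this stand-in.
const nativeMath = { sum: sumJs };

console.log(nativeMath.sum(1, 2));   // 3
console.log(nativeMath.sum(1, 2.1)); // 3, because 2.1 is cast to 2
```

The second call is the surprise the demo goes on to show: a plain JavaScript `a + b` would give 3.1, while the typed native function truncates its input first.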
Here we are requiring a natively compiled dynamic library, which we attach to the Node process, and then we have the C++ function we wrote before exposed to the JavaScript developer. Indeed, we have native math. It has a sum function and we can use it just like any other JavaScript function. It gives us the answer. Right? The major difference here is that this function actually doesn’t run through the JavaScript interpreter, through V8, but it runs as natively compiled C++ code. There might be some unexpected results. For example, if it was a JavaScript function, and I do 2.1 here instead, you would anticipate that the return value would be 3.1, right? C++ works with types. We did a type casting of the second argument to an integer and indeed the result is 3. It’s not actually run by the JavaScript interpreter; it’s run on the C++ side.

[00:21:29] That was a very short example, an exercise in how the module actually works. In this short exercise, we saw how we can interface with C++ code and how we can expose this functionality, which we write in C++, to the JavaScript developer. You might ask yourself, “Why is it even important?” Like we said before, writing image codecs like JPEG and PNG in JavaScript is very complicated, so we want to reuse the already written C++ libraries. Also, for the algorithms to manipulate images, like resizing or cropping the images, whatever, we already have very good C++ implementations. I’m not saying it’s impossible to do it in JavaScript, I’m just saying it’s already done and written in C++, and it would take a long time to actually do this in JavaScript.

That’s basically what I did in my module. I wrote a native module for Node.js, which takes those codecs we saw before, uses them natively but exposes their functionality to the JavaScript user in a convenient API. This is what it looks like. We already did this example before. Basically, it takes an image, opens this image, rotates it 45 degrees and then writes it back to disk.
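The open, rotate, write sequence described here can be sketched as nested callbacks. The helper name `rotateAndSave` and the file names are assumptions, the yellow background is carried over from the earlier demo, and the `lwip` module is passed in as a parameter so the snippet loads even where the module is not installed. The three calls used (`open`, `rotate`, `writeFile`) follow the callback style shown in the talk.

```javascript
// Hypothetical helper wrapping the three steps from the demo.
// `lwip` is injected by the caller; nothing here requires it directly.
function rotateAndSave(lwip, inputPath, outputPath, degrees, done) {
  // Step 1: read the file and decode it into an image object.
  lwip.open(inputPath, function (openErr, image) {
    if (openErr) return done(openErr);

    // Step 2: rotate, filling the uncovered corners with yellow.
    image.rotate(degrees, 'yellow', function (rotateErr, rotated) {
      if (rotateErr) return done(rotateErr);

      // Step 3: encode the pixels again and write them to disk.
      rotated.writeFile(outputPath, function (writeErr) {
        done(writeErr || null);
      });
    });
  });
}

// Usage with the real module:
// rotateAndSave(require('lwip'), 'image.jpg', 'output.jpeg', 45, console.log);
```

Each callback marks one crossing between the JavaScript side and the native side, so an error at any step stops the chain and is passed to `done`.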
What’s happening in the background? It’s like a ping-pong between the Node.js side and the C++ side. When we open the image, it goes to the C++ side, reads the file from disk, loads it into memory, decodes the pixels by using the appropriate codec, JPEG in that case, and then stores the pixels in a memory buffer and brings it back to the JavaScript side as an image object. Next, when we actually want to rotate the image, again, it goes back to the C++ side, accesses the memory buffer, manipulates the image in whatever way we want to manipulate it and then brings this image back to the JavaScript side. [00:23:26] Finally, when we’re writing to disk, again, the same thing: just encode the pixels back into JPEG format and write the image back to disk. That’s the result. I chose not to use Lenna, just a nice kitty in this case, but it works.

What am I hoping to do next with this module? It seems like there’s a lot of demand for it. You can check out the GitHub repo later; there is lots of demand for new features and people submit pull requests and things like that. First of all, I’m hoping for more people to participate and maybe try to take part in it. One of the things I want to do is even explore the option of implementing some of the things in pure JavaScript, without any C++ code, and make it even more lightweight. You can check out the GitHub repo later and see what other things people ask for. That was the presentation. I hope you enjoyed it.