Writing Secure Node.js Code

Josh Emerson speaking at Front-End London in October, 2016

About this talk

Some of the very things that make JavaScript awesome can also expose it to security risks. This talk will go through some sample security flaws unique to Node’s async nature and surrounding ecosystem (or especially relevant to it) and will show how these could occur in your own code or in npm dependencies.


Transcript


Hi, I'm Josh. This is me on my...not even my first day at the company. They have these wands. If you see me afterwards, I've got a ridiculous number of them in my bag. They're quite fun for a little while. You can have a wand.

[00:27] I'm not sure why Snyk has wands. Snyk is a company that does stuff around security, specifically known vulnerabilities in your code stack. I'm going to allude to something that happened the other day. Do we remember Friday?

[00:41] [laughter]

Josh: [00:41] I remember Friday. I'm still traumatized by the events that unfolded on Friday. We were also trying to keep our website up on Friday. It turned out that the cause of all of the Internet outages on Friday was these little devices. Actually, let me just see if I can get speaker notes up. There we go.

[01:02] A company called Xiongmai creates the components that go inside of various products, so this might be inside of your webcam and this might be in your DVR, and people have a lot of DVRs. They are normally connected to the Internet these days. They happen to have usernames and passwords built into the hardware, not the software, so you can't change them.

[01:30] They were perfectly ripe for being used in a DDoS. Basically, all these Internet-connected devices just started hitting the DNS servers provided by DynDNS, which provides DNS for most of the websites you know about, and brought down large chunks of the Internet. Now, we don't make hardware, so why is this relevant to us? We make software, and software has dependencies too.

[01:57] We don't write all the code that executes when we deploy an application. Even an operating system or, for instance, the Node.js runtime is still a dependency, because your applications depend on it. A vulnerability at any level might potentially make you vulnerable yourself. Known vulnerabilities are the worst kind, because attackers can write mass exploits, as they did on Friday.

[02:23] npm is awesome. Open source is awesome. 
The idea of people sharing code is great because it means that we can share our work, learn from others, reuse things that other people have built, and focus on making the next great thing. If we look at the numbers, npm is one of the most active package managers at the moment, with 350,000 packages and six billion downloads per month. That's incredible. [laughs]

[02:54] It also says something about the number of times we're downloading the same packages over and over again. 65,000 separate publishers, that's different accounts that are providing code. That's mad. Do you know which dependencies you have?

[03:11] I would hope that you know which dependencies you have. You probably installed them at some point in time. The thing about dependencies, and this is particularly prevalent in the world of npm, is that your dependencies will pull in dependencies. They have their own dependencies.

[03:27] This is the Express app. It's just loading all of its dependencies. This is an online viewer up there. It takes a little while to crawl. Every time it loads a package, it has to look at its package.json and pull in all of the dependencies of that thing.

[03:43] There are 42 nodes, and there's a whole bunch of faces up there. Those are the people who contributed code to these codebases. That's great. npm is awesome, but every dependency is a security risk. For every single dependency, you should ask the question, "Do you trust the people who wrote that code and contributed to it? Do you know if it underwent any security testing?"

[04:12] I think nowadays everyone talks about testing your code. They talk about unit tests and integration tests. We want to make sure that the next time we deploy, we don't break anything. We don't really spend enough time talking about security testing, and it's a whole other thing. It's often a lot harder to test to make sure that your code is secure. 
[04:32] The thing that we really want to know is, are there any known vulnerabilities in our code? Because the moment a vulnerability is known, especially if it's prevalent, people can start to try and exploit it. You'll probably see bots crawling the Internet trying to find old versions of WordPress that are vulnerable in some way, and they're still just trying to find that website that hasn't been updated yet.

[04:54] 14 percent of npm packages carry a known vulnerability. The company I work for, Snyk, has found that 76 percent of the people using our software have found vulnerabilities within their own apps. These are not small numbers.

[05:16] I'm sure the questions on the tip of your tongue are, "How do I protect myself? Can I learn anything from these vulnerabilities?" Those two things, actually. I'm going to switch to some live hacking.

[05:29] I use "hacking" in a loose sense because I'm not suggesting that you should go and hack other people's websites. This is more me hacking a simple website that we've created called Goof. This is not Goof. [chuckles]

[05:47] This is Goof. Goof is that "Hello, world" to-do app. You can do things like type in, "Buy eggs." There you go. It's a very simple website. If we look at it here, so if I go and find the repository...

[06:06] [pause]

Josh: [06:16] we can see here that it's got eight known vulnerabilities within it. This is an app we've created to demonstrate a point. We're pulling in a whole bunch of vulnerabilities, some of which there are fixes for already. You might not be as vulnerable as this app. This app is really here to demonstrate just how you could be vulnerable.

[06:39] If I go to this website and go to the About page, I can see that this is the bestest to-do app ever. If I have a look at how that page is loaded in code, I can see that we're using a package called st to basically load some static files. It works a little bit like the Express static files module. 
[07:07] By default, when you load in this module, you will actually find that if you go up one directory, it shows you what's in that directory. You have to specifically turn off indexes if you want to. Out of the box, it won't do that for you. Now it says, "Not found."

[07:30] This isn't a vulnerability because this is how the package works. This is a "you just need to read the small print" kind of thing. You need to make sure that you're not leaving yourself exposed because you're not using the package in the way it was intended.

[07:43] But what if I go like that? What's going to happen if I try and go back up two directories from the public directory? Luckily for us, Chrome is just going to say, "OK, you meant the home page. I'll take you to that home page." It does it for us. But that's not really a fair test of what is really happening and what somebody could do if they were trying to exploit you. If I do...

[08:20] [clicking sounds]

Josh: [08:20] I think it's that one. If I do this curl command, it's basically the equivalent of what I just tried, this time going up three directories. That's fine. We've got the same home page content, so we're not vulnerable. Great.

[08:34] But if we have a look on the Snyk website at the information about the st vulnerability, we can see here something about %2e. %2e%2e is the URL-encoded version of "up one directory." It's a dot dot. [laughs] I'm going to do this. What do people think I'm about to see when I do this? Anyone?

[09:04] [audience member speaks]

Josh: [09:10] Hang on one sec. Did I do something wrong?

[09:16] [audience member speaks]

Josh: [09:16] Yeah. [chuckles] Has someone seen this talk before?

[09:20] [laughter]

Josh: [09:20] This one I tried just before I came here. I have seen this "Not found." Hang on...here we go. This is my password file. I just typed in "/etc/passwd," and I've just opened the /etc/passwd file on my computer. 
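A minimal sketch of the kind of check that stops this class of request. It is not code from the talk or from the st package: the helper name `resolveSafePath` and the root directory are hypothetical. The idea is to decode percent-escapes first, so that `%2e%2e` is seen as `..`, then resolve the path and refuse anything that lands outside the public root.

```javascript
const path = require('path');

// Hypothetical helper (not part of st or Express): map a request path onto
// a file under rootDir, or return null if the request escapes rootDir.
function resolveSafePath(rootDir, requestPath) {
  let decoded;
  try {
    // Decode percent-escapes first so "%2e%2e" is checked as "..".
    decoded = decodeURIComponent(requestPath);
  } catch (err) {
    return null; // malformed escape sequence such as "%zz"
  }
  if (decoded.includes('\0')) {
    return null; // reject embedded NUL bytes
  }
  const root = path.resolve(rootDir);
  // Prefixing "." keeps a leading "/" in the request relative to root.
  const resolved = path.resolve(root, '.' + path.sep + decoded);
  // Accept only the root itself or something strictly inside it.
  if (resolved !== root && !resolved.startsWith(root + path.sep)) {
    return null;
  }
  return resolved;
}

// Assumed example root; any directory works the same way.
console.log(resolveSafePath('/srv/app/public', '/about.html'));
console.log(resolveSafePath('/srv/app/public', '/%2e%2e/%2e%2e/etc/passwd'));
```

The important design choice is the order: decode, then resolve, then compare. Checking for `..` before decoding is exactly what an encoded request slips past.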
Luckily for me, I'm running Mac OS X, where nothing too important is in this file. But if I was on a Linux machine, this would contain some pretty secret secrets.

[09:53] You don't really want people accessing files on your hard drive. You don't want them accessing files on your server because you might have something like config files lying around there, or something that you don't want the world to see.

[10:05] This happens to be vulnerable. If we have a look here at the remediation, what we need to do is upgrade to a later version. The next version that was released fixed this issue. That's great. Ordinarily, if you were doing semver with npm out of the box, it would upgrade you the next time you did an npm install. That's good.

[10:24] The next one I'm going to look at is to do with a package called marked. marked is a Markdown parser. The way that it works, it tries to sanitize any HTML that's passed in so that you don't get cross-site scripting, which is where somebody can make some JavaScript execute on your page.

[10:44] If you've got a form, if you allow any input, a bit like the Goof app does, somebody could actually put some JavaScript into your page that runs for every user of your site and maybe, I don't know, gets their cookies so they can authenticate as them. Not good. [laughs]

[10:56] If we try a few of these, I've got some examples. The first one is just to show you that the Goof app allows Markdown, just because. It might be useful if you wanted to, say, link to the Front-End London website. Cool. That link, if I click it, would take me to the Front-End London website.

[11:25] If I type in that there, a JavaScript one, what do you think is going to happen? We've got "sanitized"...hang on. It sanitized it. That's good. If we try this one, where we've tried to URL encode some of those, or in fact, I always get confused, I think that's HTML entity encoding, but yeah. It caught that as well. Good. So far, so good. 
[11:59] There is a really interesting thing. If we do that same one again, but we go along to the percent-five-eight and put in something that's defined in the scope, such as "this," JavaScript...this isn't a valid HTML entity anymore, but browsers are pretty tolerant of weirdness in the markup.

[12:26] The Markdown parser no longer sees that as valid JavaScript. Yet the browser itself tries to coerce things into what it thinks you're trying to say, rather than what you actually did say. Now, if we click this, we get some JavaScript executing on the page.

[12:43] Just to show you, yeah, this was the one. You could do a lot more, but now you've got access to that computer, basically. You can now execute JavaScript, and anything that's happening inside of that session, you've got access to.

[13:06] The final exploit I want to show you today is to do with Mongoose and MongoDB. In order to access MongoDB, which is running on this server and is where all the data gets stored, we're using the Mongoose library, which is one of the most common libraries to use if you want to talk to Mongo from Node.

[13:31] One of the issues that it has is a potential memory disclosure. Now, the Buffer in JavaScript is quite interesting because there are two ways to initialize a buffer. A buffer, by the way, is normally used for binary data, but you can use it for strings or any kind of data. It allows you to stream data. If you've got large amounts of data, it's much more efficient in memory usage to be able to stream the data.

[13:59] You can initialize it in one of two ways. You can initialize it with either a number or a string, or other values as well, like binary. If you initialize it with a number, something fascinating happens, because it will give you an uninitialized buffer of the length specified.

[14:20] What it will return to you is that uninitialized buffer. Whatever is in memory is what you'll get back. It's not the memory that you just allocated because it's uninitialized. 
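A minimal sketch of the safer ways to create a buffer, not code from the talk. It assumes a Node.js runtime that provides `Buffer.alloc`, `Buffer.from` and `Buffer.allocUnsafe`; the helper name `bufferFromText` is hypothetical. The talk describes the older `Buffer` constructor, where passing a number was the risky path.

```javascript
// Hypothetical helper: build a buffer from text only, never from a bare number.
function bufferFromText(value) {
  if (typeof value !== 'string') {
    throw new TypeError('expected a string');
  }
  // Buffer.from(string, encoding) copies the bytes of the string.
  return Buffer.from(value, 'utf8');
}

// Buffer.alloc(size) returns memory that has been zero-filled.
const zeroed = Buffer.alloc(8);

// Buffer.allocUnsafe(size) skips the zero-fill, so it holds whatever was left
// in that memory. Overwrite it before it can reach a response.
const scratch = Buffer.allocUnsafe(8);
scratch.fill(0);

console.log(zeroed.equals(scratch)); // true, both are eight zero bytes
console.log(bufferFromText('800').length); // 3, the bytes of "8", "0", "0"
```

Naming the intent (`from` for data, `alloc` for a size) removes the ambiguity that made a number and a string behave so differently.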
It will be whatever was previously in memory and hasn't yet been overwritten.

[14:34] I can show you this with a few curl commands. In fact, we're going to use a library called http rt, it's a Ruby library, just because it does prettier output for the headers and things like that. That's our home page. We're just showing the headers there.

[15:00] In fact, sorry, I'm going to start in the browser and just show you that when we added the last bit of data to the app, it sent a POST request to the create endpoint. The content would be whatever we put into that form. Very, very simple for the purposes of demonstrating.

[15:23] We're going to do the same thing here, if I can type correctly. We're going to say Biber. This should just be the same as us using the web page. There we go, "content equals Biber." Now if you refresh the page, we've added a line item.

[15:45] Go away, Dock. That's not very helpful. If we do a JSON object, because we're running an Express server, by default...No, not by default. We have turned it on. It will allow us to send a body in JSON format.

[16:06] This will be handled just the same way. Now we've just put in 800, but you'll notice there that I wrapped quote marks around the 800. It's a string. What do you think happens if I take the quote marks off this 800?

[16:26] We're now sending a number. What we get back is some memory. It doesn't read very well in the terminal, but I'm going to do it a few times for a reason. Because of the way that memory works, we're getting random bits of memory. I have no idea what I'm going to get back, which is always a bit fun.

[16:47] If we refresh this page, we can see a lot of gobbledygook. If I view source on this page now, we can see a buffer. That's not surprising because we've been using some buffers recently. Sometimes you see objects...As I say, it's a bit hit and miss. Sometimes we'll have to run this a second time to see anything exciting. There we go. There's a prototype, undefined, find undefined. 
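A minimal sketch of validating the JSON body before it reaches a library, assuming the endpoint only ever wants text in a `content` field as in the demo. The function name `readContent` and the length limit are hypothetical and not taken from Goof.

```javascript
const MAX_CONTENT_LENGTH = 500; // assumed limit, purely illustrative

// Hypothetical validator: accept the "content" field only when it is a string.
function readContent(rawBody) {
  const body = JSON.parse(rawBody); // throws SyntaxError on malformed JSON
  const content =
    body !== null && typeof body === 'object' ? body.content : undefined;
  if (typeof content !== 'string') {
    // A number, object, or array here is the type manipulation to refuse.
    throw new TypeError('content must be a string');
  }
  if (content.length === 0 || content.length > MAX_CONTENT_LENGTH) {
    throw new RangeError('content length out of range');
  }
  return content;
}

console.log(readContent('{"content": "800"}')); // the quoted form is accepted
try {
  readContent('{"content": 800}'); // the unquoted form is a number
} catch (err) {
  console.log(err.message);
}
```

Checking the type at the edge of the application means the difference between `"800"` and `800` is decided by your own code, not by whatever a dependency happens to do with it.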
[laughs]

[17:15] It's entirely possible, it's very likely, that if someone was running this for long enough, they would start to see some secrets. They would see either someone's password that they just put in and was traveling on its way to somewhere, or they would see, I don't know, some config items again that you were sending about, even if they're just stored in memory.

[17:37] It doesn't have to be on the file system now. We're looking at the memory of the system. Anything that's loaded into your Node server is now potentially available to some hacker.

[17:47] What do you do about all this? It's all very doom and gloom so far. Knowing about a vulnerability is obviously the first thing. As soon as a vulnerability is known, most active package maintainers will normally be writing some sort of fix for it. And most of these do have an upgrade path.

[18:05] But you might find that there is a period of time for which you are vulnerable and there is no upgrade path. The Buffer one, by the way, is very similar to the Heartbleed SSL attack that happened, and that kind of...

[18:17] affected everyone because of the nature of SSL and that we were all dependent on SSL.

[18:24] And they fixed that one pretty quickly because of the widespread nature of it, but it did require each server to update itself. If you're in here, you can actually open a Fix PR, which is a very handy little plug. [laughs] What this is going to do is open a pull request with fixes for the ones that are able to be fixed and patches for the ones that can't just be upgraded.

[18:54] Once we've opened that Fix PR, it's going to take us to GitHub. It takes a little while sometimes. There we go! And it tells you exactly what it's going to upgrade and what it's going to patch.

[19:06] And then if we have a look at the files changed, we can see that basically it's updating the version numbers on a bunch of dependencies, and it's running snyk protect, which will apply the patches. 
And the patches are just a very small diff trying to do the minimal change to make it not vulnerable any more. Just finishing up: don't start hacking sites. As I said before, this is for demonstration purposes only.

[19:34] We've learned how to protect yourself. You want to address known vulns in your dependencies by finding them and fixing them with either upgrades or patches. Also, importantly, preventing additional vulnerable packages. It's good to be alerted if somebody's going to make a pull request, for example, that will contain a new vulnerability.

[19:57] You want to know about that. And also responding quickly to new vulnerabilities, because it's not enough to do an audit today and say "I'm vulnerability-free today," because tomorrow a new vulnerability might be discovered in a package you're already using.

[20:11] What did we learn? Consider URL encodings. And I think that's harder than anyone at first thinks, because as soon as you're using something like Markdown, there are a million ways that somebody can write malicious code. HTML and URL encodings seem to be prevalent, and we do see a lot of patterns here, a lot of similarities.

[20:34] If you can, whitelist instead of blacklist. So say, "We only accept a very small subset of things." It's very hard to go around and list each individual thing you don't allow.

[20:42] Beware of JSON type manipulation. JSON is really nice in that you can accept a lot of different formats, but these things come straight into your servers. What did you really want someone to be able to submit here? Did you want a number? And don't initialize a buffer with integers. Do not do that. npm is awesome. Please enjoy responsibly. [laughs] Thank you.
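A minimal sketch of the whitelist advice applied to link targets, not code from the talk or from marked. The names `ALLOWED_PROTOCOLS` and `isAllowedLink` are hypothetical, the sample inputs are illustrative, and the global `URL` class is assumed to be available in the Node.js runtime. Instead of trying to list every disguised spelling of a script link, it accepts only links that parse as absolute URLs with an approved scheme.

```javascript
// Assumed whitelist: the only schemes a user-supplied link may use.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

// Hypothetical check: true only for absolute URLs with a whitelisted scheme.
function isAllowedLink(href) {
  if (typeof href !== 'string') {
    return false;
  }
  let url;
  try {
    url = new URL(href.trim());
  } catch (err) {
    // Not an absolute URL at all, which includes entity-mangled schemes.
    return false;
  }
  // The parser lowercases the scheme, so "JAVASCRIPT:" is still caught.
  return ALLOWED_PROTOCOLS.has(url.protocol);
}

console.log(isAllowedLink('https://example.com/')); // true
console.log(isAllowedLink('javascript:alert(1)')); // false
console.log(isAllowedLink('javascript&#58this;alert(1)')); // false
```

This sketch also rejects relative links, which is a deliberate trade-off here: anything the parser cannot positively identify is refused rather than passed through to the browser.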