Let's Shake Some Trees

Deepan Aiyasamy speaking at Ember London in February, 2017

About this talk

In this talk, Deepan Aiyasamy shows how the team at British Gas are using recast with Ember CLI. Deepan also takes some time to explain abstract syntax trees, and how Ember's commitment to backwards compatibility has influenced the way British Gas introduce changes to their code.


Transcript


Right. So, a quick background. At British Gas, we've been using Ember for a couple of years now. We started with a single application that allowed our customers to get a quote for their energy products. Since then we've been expanding, and we've got about six applications in production at the moment. These are our core apps: we have the online account management app, the quote application, a home application, and a services quote-and-sale application. So, basically, all of our core customer-facing applications are in Ember at the moment.
One of the things we realized as we developed these applications is that the more applications you have, the more you build up a backlog of concerns that are common to several of them: services that are reused across multiple applications, models that are reused across multiple applications. And because you don't want people to duplicate the same thing over and over again, you really want to put them all in one place where different applications can share them. So, what we did a year and a half ago was to create an Ember add-on called Commons. It's a private add-on that we use internally at British Gas, and it provided the different applications with reusable entities: models, services, components, helpers. Basically, anything we thought was of common concern across multiple applications. Over time, the problem with Commons is that it started getting bigger and bigger, because we kept adding more models, more services, and more common components.
We shoved them all into this one add-on. Take a lightweight app: we have an app called Retention that is effectively just a single-route Ember application, so it's pretty light, while some of our other apps are quite large-scale. The problem with this approach is that those smaller applications still had to rely on Commons if they wanted access to a couple of services or a couple of models. Because Commons is so fat, and there was no way to provide applications with only the models they needed, they effectively inherited everything that was in Commons. For example, we have about 40 models in Commons, and the Retention app used only a couple of them. But because it relied on Commons for those, since we didn't want to duplicate the model code, it actually inherited everything: all the services, components, and helpers that it didn't use. So it started to be an example of a badly designed inheritance system.
All right. So, I have to recreate the conversation we had this week around inheritance. So: inheritance is pure evil. This week we had a discussion internally about inheritance versus composition. It's a very nice topic. It came out of a discussion about when you use extend in Ember and when you use mixins. You could use mixins to add behavior to an Ember object, or you could extend an Ember object from another Ember object. Both give you the same behavior, but one is typically inheritance, while the other can be classed as composition. Inheritance versus composition is quite a hotly debated topic in the community, especially in the Java community, where you have classical inheritance and people talk a lot about why deeply nested inheritance structures are bad.
The reason deeply nested inheritance is bad is something like that. The reason this picture is relevant is that the usual way of describing badly designed inheritance systems is that you ask for a banana, but you get a gorilla holding the banana in the forest. All you want is a simple service, but because that service inherits from maybe 10 classes, you get 100 things that you don't use or need. It's a common problem in inheritance systems, especially very deeply nested ones where you have a really broad and deep hierarchy. Those types of systems are quite hard to scale, or if you do scale them, you can end up in a situation like that, where you just want some simple behavior, but because that behavior is in a class that inherits from, like, 30 classes, you get all the behavior you don't need. Which is why people advocate using composition, where you extract specific behaviors and then mix them into your object. You don't have deep inheritance chains, you just add the required behavior to your classes. I was going to present this at the end, but another way you could describe it is that you wanted a half-pint, but you got a stein of beer. Although you wouldn't mind that.
So, this is the background to why we had to do what we did. Our Commons was starting to look more and more like that kind of inheritance system: different apps needed only specific assets, but they were getting a massive number of entities built into them just because they added the Commons package. For other apps that makes sense: the dashboard app, for example, uses about 70% to 80% of what is there in Commons.
But for smaller apps that only require a few assets, it was quickly becoming an overhead. It obviously impacts performance, because the app is downloading stuff it doesn't need. So, we decided to go down the route of finding a way to provide apps with only the models they need, and an interface for apps to ask for them. We needed to shake some trees, basically.
There are two ways to do it. The first is "I'll find out what you need": you have a program that goes through your code, finds out which models you use, and then gives you only the models it identifies. The second one is simpler, "you tell the code what you need", and then the code just takes care of filtering the tree. The second is obviously easier to implement as a starting point. It can get messy after a while, because if you have 10,000 dependencies to add to your inclusion list, it's not great. So at some point you might want to consider the first option, or a mixture of the two, but an easy place to start is option two, "you tell me what you need."
Actually, the second option is not that difficult to implement now. If you look at some add-ons, they already do it. For example, if you've used ember-composable-helpers, which is one of the add-ons from DockYard, I think it was, from Lauren Tan, you can specify the helpers you need. It comes with a set of helpers, but you can choose to have either a blacklist or a whitelist. So although you have 100 helpers, you can say you only want three of them, or that you want 97 of them and don't want just three. You can do it both ways. Here you are basically telling the code which ones you want, and all the code is doing is taking that list and providing your app with only those helpers.
So, it's a good place to start if you don't have too many dependencies to list in your file. We decided to go down the second path, the one similar to composable helpers, as a starting point to see how this works for us. So, what are the tools you need? Ember CLI, obviously. You need Broccoli Funnel, which I'll talk about soon, and then you need Recast. I'll give you some context on Recast later on, what it is and what it does. But for the first bit, where you just want a whitelist of models, so you say "include three models", you don't need anything other than Broccoli Funnel, which is a Broccoli plug-in. It's very easy to integrate into your Ember CLI build process. Broccoli works with trees, which are hierarchically structured. What you do with Broccoli Funnel is provide a tree and, as a second argument, tell it what to keep: you can provide a function, a regex, or whatever, and that filters the tree that gets built into your app. As most of you will be aware, in the Ember CLI build process, add-ons give a tree to the application. There's a hook called treeForAddon, if you've heard of that. It's an Ember CLI build hook that allows you to customize what the add-on provides to your application. So, within the treeForAddon method you can create a funnel like that, and then choose to exclude or include the models that you specified in your build file. That part is very easy to do. You can do basic tree shaking now with that type of approach, and it works for a lot of cases. People try to go the complex way and identify elements proactively and things like that.
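As a rough illustration of the include list described above, here is a minimal sketch in plain JavaScript. It does not use Broccoli at all: `makeModelFilter` and the file paths are hypothetical, and the assumption is that a predicate of this shape is what would be handed to Broccoli Funnel inside the `treeForAddon` hook, rather than applied to an array as it is here.

```javascript
// Hypothetical helper: builds a predicate that keeps every non-model file
// and only those model files whose names appear in the include list.
function makeModelFilter(includedModels) {
  const wanted = new Set(includedModels);
  return function keep(relativePath) {
    const match = /^models\/([^/]+)\.js$/.exec(relativePath);
    if (!match) {
      return true; // not a model file, so leave it alone
    }
    return wanted.has(match[1]);
  };
}

// Stand-in for the add-on's tree: a flat list of relative paths (assumed names).
const addonFiles = [
  'models/model-one.js',
  'models/model-two.js',
  'models/model-three.js',
  'services/session.js',
];

const keep = makeModelFilter(['model-one', 'model-two']);
const shaken = addonFiles.filter(keep);
console.log(shaken);
```

The predicate only decides per file, which is why this first step needs nothing beyond a funnel: no source code is read or changed.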
You can do that, but I've found that starting with the simplest possible solution is always a good idea, to get your feet wet and see how the thing works for you. So, Broccoli Funnel is the first tool we used.
This is the first solution we came up with. Commons had models, services, and components, and the first step was to deal with models alone. We left services and components as they were and just looked into what we could do with models. So, we created a separate add-on called Digi-models which held all of our models. The thing with models is that they can also contain relationships. Especially if you're using Ember Data, models can contain relationships to other models. So that's another variable here: you need to think about what you do with relationships. If somebody says they want a model, what do you do about that model's relationships? Do you automatically include all of them? Or do you ask the user to specify which relationships they want as part of the includes? There are a couple of ways you can approach the problem.
So, this is the first pass, our very first attempt at this. Say this was the model you had, and within your included array you specified that you wanted to include model one. The output you would get was model one, plus all the related models within that model. To do this, we had a script that parsed the model file, identified the related models, and then included those models for you as well.
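The first pass described above amounts to following relationships from the requested models until nothing new turns up. The sketch below shows that idea under stated assumptions: the relationship map is written by hand (the real script derived it by parsing model files), and `resolveWithRelationships` and the model names are hypothetical.

```javascript
// Hypothetical relationship map: model name -> names of directly related models.
const relationships = {
  'model-one': ['model-two', 'model-three'],
  'model-two': [],
  'model-three': ['model-four'],
  'model-four': [],
};

// Follows relationships from the requested models until no new model appears.
function resolveWithRelationships(included, graph) {
  const result = new Set();
  const queue = [...included];
  while (queue.length > 0) {
    const name = queue.shift();
    if (result.has(name)) {
      continue; // already handled, which also guards against cycles
    }
    result.add(name);
    for (const related of graph[name] || []) {
      queue.push(related);
    }
  }
  return [...result];
}

const resolved = resolveWithRelationships(['model-one'], relationships);
console.log(resolved);
```

Asking for a model near the root of this map returns every model in it, while asking for `model-two` returns only `model-two`.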
It is okay as a solution. The problem is that your tree of related models is large, so depending on which level of model you include, you might get more models than you need. For example, if you included a root model, you would get all of the models, because model one would have model two and model three related, and model three might have related models of its own. If you follow the chain, you basically get all the models. So, depending on which area of the tree you specify, you might still end up with a lot more models than you actually need. It gets rid of the forest, but you still have the gorilla, basically. So, that was the first pass. It worked okay for a few apps, but we found that the larger apps were still inheriting models that they didn't need. It wasn't entirely perfect.
So, we improved on that solution. The second solution was to allow the user to specify not just the model they want, but also any relationships they want to access. In this case it's the same model, but here you say you want model one and model two, but not model three. So the second pass would give you just model one and model two. The problem with this is that when you use this model in your application, if the API you're talking to gives you a value, a key, for model three, Ember Data will throw an error, because it tries to find model three and can't, since you haven't included model three in your includes. There are a couple of ways to get around that. You could request only the keys you need from the API: JSON API gives you sparse fieldsets, where you request only the fields you want.
So, in this case, when you make a request to the API, you would say you only need model two and not model three, and the API only gives you model two. In that case, you don't have a problem. The other way is to do some fancy stuff in your serializers to get rid of model three somehow. The third way is to use codemods.
This piece of work that I'm going to show you was inspired by something called ember-watson. I don't know if you've heard of it. ember-watson is basically a codemod tool that you can run on Ember apps, and it transforms your modules. So if you want to migrate from a previous version of QUnit to the latest version, or if you're upgrading from the Ember 1.x series to the 2.x series, it provides some useful codemods that you can run on your source code, and it changes the actual modules to make them compatible with the version you're upgrading to. It's a very useful add-on, with a lot of codemods that make the upgrade process less painful. Instead of you going through modules, finding and replacing stuff, it does that for you.
So, what we wanted to do was to solve this problem. We don't have support for sparse fieldsets on our APIs at the moment, and we didn't want to muck about with the serializers. What we actually wanted was that, as part of this process, the build would not only include model one and model two, but also recast those models. What I mean by that is that model one will be transformed into something like that. What you have here is model two: it's a belongsTo relationship there, left as it is because you requested it.
But model three here has actually been replaced with a computed property. The relationship that was there previously has been replaced with a computed property that throws an assert if you try to access it. So in case you've forgotten to include model three but you access that relationship somewhere in your app by accident, it gives you an assert saying, "Please include model three to access this relationship." So we wanted to do not just the tree shaking, but also a recast of the code that's provided to the app, so that models are transformed like this. So, yeah?
- [Man] So, would the assert be removed in production?
- Yes, it's removed in production, yes.
Right. So, this is Recast. This is the definition from Wikipedia of what it is. We're not going to melt anything here, but Recast is actually a very powerful AST tool. AST is one of those things you get taught in computer science at uni and then don't care about beyond that. But once we started using this, we could see more of what ASTs actually are and what kind of value they can give you. Basically, Recast takes your source. It's quite simple: it has a parse method, you pass your source to it, and it gives you an AST, which is an abstract syntax tree. You make modifications to that abstract syntax tree, and then you can print it back out as the modified source of your module. So, instead of using regular expressions or something like that on your code as a string, what you get is a proper object that you can traverse and modify. It's a more natural way of making modifications to source code, as opposed to treating your source code as strings and using regexes and things like that.
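The replaced relationship described earlier in this part of the talk can be imitated in plain JavaScript. This is only a stand-in, not the generated code: the talk describes an Ember computed property that asserts, whereas this sketch uses an ordinary getter that throws, and `missingRelationship` and the property names are hypothetical.

```javascript
// Hypothetical stand-in for the generated computed property:
// a property descriptor whose getter fails loudly when an excluded
// relationship is read.
function missingRelationship(modelName) {
  return {
    enumerable: true,
    get() {
      throw new Error(`Please include ${modelName} to access this relationship.`);
    },
  };
}

// modelTwo was included, so it stays as an ordinary property.
const modelOne = { modelTwo: { id: 2 } };
// modelThree was not included, so reading it should complain.
Object.defineProperty(modelOne, 'modelThree', missingRelationship('model-three'));

let message = null;
try {
  modelOne.modelThree; // accidental access to the excluded relationship
} catch (error) {
  message = error.message;
}
console.log(message);
```

The point of the design is that a forgotten include fails at the place where the relationship is read, with a message naming the model to add, rather than surfacing later as a lookup error.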
So, that's basically what Recast does: it gives you parse and it gives you print, the two methods you get out of the box. Moving on to abstract syntax trees, let's talk a bit about them. An abstract syntax tree, or just syntax tree, is a tree representation of the abstract syntactic structure of source code written in a programming language. It's better if I show you what it actually is, so let's have a look at some tools. One of the tools we found very useful is AST Explorer. It's a brilliant tool that allows you to visualize ASTs. You paste your module on the left, and it gives you the AST on the right. There are different AST generation libraries at the moment: there is Esprima, and a whole bunch of other tools that create ASTs from programs. Basically, if you give one of them a program, it gives you an object representation of that program. It deconstructs all the statements in your program, which means you can query your program like you would query a normal object. You can get all the object expressions, you can get all the callees, and things like that. So it gives you a structured API for your source code.
Let's take a simple function like that: a simple method that adds two numbers. In the AST for that, you get a type. This is the body of the function here, it has a return statement, and you can find out what the params are that go into the function. It has an identifier as well, which tells you the name of the function, and then you get the params that are passed to the function, which are these two.
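For readers without the slides, the tree for a two-number add function looks roughly like the object below. It is written by hand in the ESTree style and trimmed for space; a real parser such as Esprima produces the same node types with extra fields, and the function name `add` and parameter names are assumptions.

```javascript
// Hand-written, trimmed tree for:  function add(a, b) { return a + b; }
const addAst = {
  type: 'FunctionDeclaration',
  id: { type: 'Identifier', name: 'add' },
  params: [
    { type: 'Identifier', name: 'a' },
    { type: 'Identifier', name: 'b' },
  ],
  body: {
    type: 'BlockStatement',
    body: [
      {
        type: 'ReturnStatement',
        argument: {
          type: 'BinaryExpression',
          operator: '+',
          left: { type: 'Identifier', name: 'a' },
          right: { type: 'Identifier', name: 'b' },
        },
      },
    ],
  },
};

// Because the program is now a plain object, it can be queried like one.
const functionName = addAst.id.name;
const paramNames = addAst.params.map((param) => param.name);
const returned = addAst.body.body[0].argument;
console.log(functionName, paramNames, returned.operator);
```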
And then you have an expression within the body that represents what the function does. So, that is the abstract syntax tree of a basic method. If you think about it, you can relate it to what Babel does when it does the transpilation from ES6 to ES5, or what I've seen recently with Lebab, I think, which does it the other way around. All these tools use ASTs extensively to query the source code and then make modifications to it. As you can see here, you get an object that represents your program in a queryable structure, so you don't have to use crazy regex expressions to access various areas of your program.
In our case, say this is the model that you have. What we're really interested in is finding all of these things. Oops. So, as you can see there, you have the object properties, you have the call expression, which is this, and then you have the member expressions here. One of the good things about ASTs is that there are several libraries that let you traverse an AST very effectively. ASTs have different types of nodes in them: call expressions, member expressions, object expressions, and so on. But say, in this case, we're only interested in the member expressions. What we did was use a library called ast-types. That library lets you visit just the member expressions in an AST, because really, that is what we want in this case.
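Visiting only one node type can be sketched with a small recursive walker. This is a hand-rolled stand-in and not the ast-types API: `visitType` is hypothetical, and the tree is a trimmed, hand-written version of what a parser might produce for a single relationship on a model.

```javascript
// Hypothetical, minimal walker: calls `callback` for every node of `type`.
function visitType(node, type, callback) {
  if (Array.isArray(node)) {
    node.forEach((child) => visitType(child, type, callback));
    return;
  }
  if (node === null || typeof node !== 'object') {
    return; // strings, numbers and so on are not nodes
  }
  if (node.type === type) {
    callback(node);
  }
  Object.values(node).forEach((child) => visitType(child, type, callback));
}

// Trimmed, hand-written tree for:  { modelTwo: DS.belongsTo('model-two') }
const tree = {
  type: 'ObjectExpression',
  properties: [
    {
      type: 'Property',
      key: { type: 'Identifier', name: 'modelTwo' },
      value: {
        type: 'CallExpression',
        callee: {
          type: 'MemberExpression',
          object: { type: 'Identifier', name: 'DS' },
          property: { type: 'Identifier', name: 'belongsTo' },
        },
        arguments: [{ type: 'Literal', value: 'model-two' }],
      },
    },
  ],
};

const found = [];
visitType(tree, 'MemberExpression', (node) => {
  found.push(`${node.object.name}.${node.property.name}`);
});
console.log(found);
```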
We don't care about any other object expressions, we don't care about export declarations or whatever. All we want is to go through all the member expressions. These libraries give you tree walkers, where you specify which type you want to visit in your AST and pass a callback, and it executes that callback for each node of that type in the tree. They are really powerful ways to traverse an AST. Once you've done that, all we do is find out whether the model is part of the inclusion list. If it is, we don't do anything, we leave it there. If it is not, we replace it with a computed property. So, as part of the treeForAddon method in the Ember CLI build process, we take the source, create an AST from it, and find out if there are any relationships. If a relationship points to a model that's part of the inclusions, we don't do anything with it. Any other relationship we replace with a computed property that asserts at run time, and that module is then provided to your application. So, in your application, what you get is basically what I showed you previously: for model three you get a computed property, and if your code accidentally accesses it, you get an assert saying, "Please include model three."
This has helped a lot with some of our smaller apps, which only require a few models. Like I said, one of our apps, the Retention app, only uses 3 of the 40 models we have. It's been useful for those apps because they now get only what they want. It's reduced file sizes on those apps by as much as 20% to 30%, improving their performance, obviously.
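The keep-or-replace decision described above can be sketched on a list of property nodes. Everything here is an assumption made for illustration: the nodes are hand-built rather than parsed, `dsProperty`, `relatedModelName` and `stubExcludedRelationships` are hypothetical names, and `AssertStub` is an invented node type standing in for the computed property that the real build generates and prints back out.

```javascript
// Hypothetical builder for a trimmed property node such as
//   modelTwo: DS.belongsTo('model-two')
function dsProperty(key, method, argument) {
  return {
    type: 'Property',
    key: { type: 'Identifier', name: key },
    value: {
      type: 'CallExpression',
      callee: {
        type: 'MemberExpression',
        object: { type: 'Identifier', name: 'DS' },
        property: { type: 'Identifier', name: method },
      },
      arguments: [{ type: 'Literal', value: argument }],
    },
  };
}

// Returns the related model's name, or null if the property is not a relationship.
function relatedModelName(property) {
  const { value } = property;
  const isRelationship =
    value.type === 'CallExpression' &&
    value.callee.type === 'MemberExpression' &&
    ['belongsTo', 'hasMany'].includes(value.callee.property.name);
  return isRelationship ? value.arguments[0].value : null;
}

// Leaves included relationships alone and swaps the rest for a placeholder.
// 'AssertStub' is an invented node type, not something a parser emits.
function stubExcludedRelationships(properties, includedModels) {
  return properties.map((property) => {
    const modelName = relatedModelName(property);
    if (modelName === null || includedModels.has(modelName)) {
      return property;
    }
    return {
      ...property,
      value: {
        type: 'AssertStub',
        message: `Please include ${modelName} to access this relationship.`,
      },
    };
  });
}

const modelOneProperties = [
  dsProperty('name', 'attr', 'string'),
  dsProperty('modelTwo', 'belongsTo', 'model-two'),
  dsProperty('modelThree', 'belongsTo', 'model-three'),
];
const includedModels = new Set(['model-one', 'model-two']);
const rewritten = stubExcludedRelationships(modelOneProperties, includedModels);
console.log(rewritten.map((property) => `${property.key.name}: ${property.value.type}`));
```

Plain attributes and included relationships pass through untouched, so only the excluded relationship changes shape.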
At the moment, we only do it for models, but now that we've seen what this can do, we plan to extend it to the other shared entities we have, including components, services, and all the rest. So hopefully at some point, we'll have... Obviously the other bit of this is Ember itself. So how do you tree-shake Ember modules? Actually, you can take this to the next level and identify dependencies dynamically, because once you have access to the source code like that, you can query it like you would a normal object structure. You can identify dependencies dynamically and include only those. You can do all sorts of stuff with this. With Ember itself, I think we have a modules RFC, from Tom Dale, which is trying to split Ember itself into modules, so you import only the modules you need. And then, hopefully, Ember CLI will be able to tree-shake so that you don't get all of them, but only the pieces of Ember you use in your app. So, tree shaking, I think, is a concept that's here to stay. It has good performance implications for your apps. And, like I said, it's quite easy to get started with. You can get started with Broccoli Funnel, and there are several libraries like Recast which make creating and interacting with ASTs very easy. Hopefully you will find a use for it in your apps. That's it. Thank you.