You're Only Supposed to Blow the Bloody Doors Off

Léonie Watson speaking at Bristol JS in May, 2017

About this talk

Using code examples and screen reader demos, we will look at accessibility mechanics in the browser, the new Accessibility Object Model (AOM) JavaScript API, and how to use JavaScript so you only blow the bloody doors off!


- Good evening, my name is Leonie Watson. For those of you I haven't met before, I work for a company based in the states called The Paciello Group, or TPG on less formal occasions. They, by my own good fortune, pay me to do a lot of work with the W3C, and so I'm on the advisory board for the W3C. But most of my time is spent as co-chair of the web platform working group, so we're the working group that look after the HTML specification, DOM, shadow DOM, custom elements, service workers and probably a whole bunch of other things that you've either come across or used in your day to day jobs. My background, however, is here in Bristol and in accessibility, something that's important to me as a consumer because I'm blind, which means I pretty much can't see any of you, never mind anything like technology or mouse cursors or any of those things, but it's also something that I just think is a growing importance for everybody who develops and creates products. I firmly believe that as developers and designers we create products because we want people to use them, for a whole bunch of reasons. And we want people to be able to use them successfully and we want a lot of people to be able to use them successfully. That's pretty much what's right at the core of accessibility. If you don't get a chance to catch up with me later, my contact details are up on the screen, you can find me in various places, drop me a line any time with some questions, I'm always happy to talk about this stuff. So I'm gonna pick up on a quote from one of my favourite films, The Italian Job, because it describes for me the power that we wield as developers in a lot of respects but in this case accessibility. - Five, four, three, two, one, go. You're only supposed to blow the bloody doors off. - So as developers in particular, we have a lot of technologies at our disposal and two I'd like to talk about today are JavaScript and ARIA. You'll often hear people say "Can JavaScript be made accessible?" 
Or "JavaScript isn't accessible, we can't use it." The simple truth is, like all technologies, it isn't the thing itself, it's how you use it that makes the difference and you can use JavaScript and ARIA for both extremely good things and with remarkable ease to blow the entire car up if you're really not careful. So hopefully by the end of this talk I'm gonna give you guys some idea of how to finesse the way you use JavaScript and ARIA for accessibility, so you only blow the bloody doors off and not the entire car up. To do that, I'm gonna tell you a little bit about the assistive technology that I use, it's a screen reader, if you haven't come across one. It's a piece of software that translates what most people see on screen into synthetic speech. There are four key players really in this scene at the moment. There's the Jaws screen reader from Freedom Scientific, or a company now called VFO. There is Narrator, which comes integrated with Windows 10. Voiceover, which comes integrated with Apple products, including Mac OS products. And NVDA from NV Access, which is a free and open source screen reader. All but VO are therefore Windows screen readers. And all of these technologies have different capabilities but essentially they do the same basic thing, they enable someone who can't see what's happening on screen to engage, interact with and enjoy content in the browser and across their operating system. Windows screen readers, in particular, do something quite unique when they encounter a webpage or a document in the browser. They grab the entire thing and then store it in a virtual buffer. And they basically do this for efficiency, screen readers need to query a lot of information, as we'll come to see over the next few slides, and if they had to make a call to the browser every time they wanted to find out the accessibility information about some node on the screen, it would be a remarkably inefficient way to do things. 
So they grab this copy of the screen and store it in a virtual buffer to make that interaction point, that querying of information, a lot more efficient from the user's point of view. What this means is that screen readers can grab all kinds of information and, as I say, translate them into synthetic speech, like chunks of text. - [Synthetic Voice] ARIA, accessible rich internet applications is a suite of specifications from the W3C. Knowing which specification has the information you need isn't always obvious, so this post briefly-- - So you can use screen readers to read entire paragraphs, sentences, words, characters, lines, an entire document from top to bottom if that's what you particularly want to do. You can also get them to acknowledge different HTML elements in the browser, and use them as navigation hooks. So for example, with a screen reader, you can use the H key to navigate by headings in the page. - [Synthetic Voice] Level one. Quick guide to the ARIA specifications. Heading level two, link. Using the aria-current attribute, heading level, two, link. - And you'll have noticed there that not only did it tell us something about the heading level, based on whether it was the H1 or H2 element that was being used, but it also recognised that these particular headings have a link inside them. If that link had been visited, the screen reader would also have picked up on that information and communicated it to the user. For a lot of these shortcut keys, they repurpose keys that we typically use for typing. I mentioned H for headings, most screen readers will let you do things like L for lists, T for tables, G for graphics, I for list items, a whole bunch of things, pretty much every key on the keyboard has some kind of screen reader specific purpose when you really get down to it. 
So what the Windows screen readers do in particular, when you need to actually use those keys for things like typing, is that they start passing those key strokes straight back through to the browser, the screen reader basically ignores them. So an example is form fields. - [Synthetic Voice] T-E-Q-U-I-L-A. Tab, which type, combo box, blanco, reposado, joven. - So in that one in the edit box, the screen reader was just passing the keys for tequila straight through, they were just appearing exactly as you'd imagine that they would. In the case of the select, the combo box, when focus was moved to that field, the screen reader started passing the arrow keys straight back through to the browser. So normally in screen readers, the down arrow key will read the next line or the next object of content, but when that key stroke's passed straight back through to the browser, it does pretty much what you'd expect it to do from a keyboard user's point of view and starts scrolling through those options. The two different screen readers on the two different platforms, they have a keyboard intercept and it works in slightly different ways. Windows screen readers will detect a key stroke. If they need to use it for one of their shortcuts or something, they'll grab it and use it and if they don't they will then pass it back through to the browser for it to be used for the website, for the application, or anything else that it may be needed for. The Mac screen reader works in a slightly different way, in fact in almost entirely the opposite way. It listens for key strokes and it basically says "Right, if the browser or the application or the content doesn't need it, then we'll pass it back through to the screen reader." This is actually a somewhat more efficient model than the way the Windows screen readers do it because it stops this particular problem happening. A lot of websites, Twitter for example, use JavaScript to provide keyboard shortcuts.
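As a rough illustration of the two intercept orders just described, the sketch below models each side as nothing more than a set of keys it wants to use. The names `KeyClaims`, `windowsStyleChain` and `macStyleChain` are hypothetical, and real screen readers are far more involved than this; it only shows why the order of the chain decides who ends up handling a key.

```typescript
// Who ends up handling a key stroke in this toy model.
type Handler = "screen reader" | "page";

// Hypothetical description of which keys each side wants to use.
interface KeyClaims {
  screenReader: ReadonlySet<string>; // e.g. H for headings
  page: ReadonlySet<string>;         // e.g. a JavaScript shortcut on the site
}

// Windows-style order as described in the talk: the screen reader sees the
// key first and only passes it on if it has no use for it.
function windowsStyleChain(key: string, claims: KeyClaims): Handler | null {
  if (claims.screenReader.has(key)) return "screen reader";
  if (claims.page.has(key)) return "page";
  return null; // nobody wanted the key
}

// Mac-style order as described in the talk: the browser, application or
// content gets the first chance, and the screen reader takes what is left.
function macStyleChain(key: string, claims: KeyClaims): Handler | null {
  if (claims.page.has(key)) return "page";
  if (claims.screenReader.has(key)) return "screen reader";
  return null;
}
```

With a key that both sides claim, the first function hands it to the screen reader and the second hands it to the page, which is the whole difference between the two models in this simplified picture.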
Because of the keyboard intercept chain on Windows screen readers, these JavaScript shortcuts don't work, because the screen reader detects the key stroke, decides it needs it for itself and says to hell with the browser and the JavaScript and anything else therein and of course the shortcuts don't work. So on Twitter for example, you can use the J key to move to the next tweet in your stream. - [Synthetic Voice] Jump to line dialogue. Enter line number one to 954, end. - Except in this particular screen reader, which is Jaws, the J key gets swallowed because it opens the jump to line dialogue instead. So that keyboard shortcut just simply won't work for Windows screen reader users. There is a bit of a dirty hack you can employ if you're a Windows screen reader user who really knows their screen reader. You can actually tell your screen reader to ignore the next key stroke that you hit and send it straight back through to the browser. To be honest I've never come across a user in the real world who isn't some kind of IT practitioner who knows about these key strokes. But this is basically how it works. - [Synthetic Voice] Tweets The Paciello Group, The Paciello Group. A quick guide to ARIA specifications from @LeonieWatson to-the-aria-specifications/ #a11... - So essentially it just said pass the key through and then it did just that and then the keyboard shortcut worked as expected, the screen reader acknowledged it and started reading the next tweet. But if you've gotta hit the pass through command and then the key you want every time, it really isn't that much of a usable shortcut. So what do we do in terms of a solution for this one? The solution is to make sure that there is a physical alternative within the interface that someone using a Windows screen reader can use. So this is Twitter's version. - [Synthetic Voice] The Paciello Group, link. Tab, May 17th link, 5:41 pm May 17th 2017. Tab, More button menu. Tab, @LeonieWatson link, tab, tab #a11y link. Tab, #WAIARIA link.
Tab, #APG link. Tab, #AAM link. Tab, Tweet actions, tweet actions reply button. Tab, tweet actions retweet button. Tab, tweet actions like button. Tab, The Paciello group retweeted Leonie Watson. Reading @GitHub issues just got easier with a screen reader, now each comment ha-- - Finally. So there is a catch. The trick is if you're gonna use JavaScript to create keyboard shortcuts, make sure that it's an enhancement to the default functionality, to the primary functionality. But try to make sure that that default functionality is usable. That was a hell of a lot of tab key pressing to get from one tweet to another. Now fair enough I chose a pretty extreme example with a lot of links and other focusable things on the way through the Twitter website. But if you're a keyboard user or a Windows screen reader user and that's all the activity you've got to get through to get to the next tweet, you're pretty soon just gonna give it up as a bad job unless you're really committed. So how does this all work under the hood? Well it all starts literally at the operating system level. If you look at a basic UI component in the operating system, this one's a checkbox from somewhere in Windows 10, and inspect it from an accessibility point of view, you'll find out that it's got some standard information available about it. It has a role that describes what it is, in this case a check box. It has an accessible name which describes what the check box is for, in this case it's Bold, same as the visible label, and there's information about its states, in this case it's a focusable component. It is focused and we also know that it's been checked. And this information can be queried by assistive technologies like screen readers, using platform accessibility APIs, and these are available on every platform. Windows, in fact, has three on the go at the moment, Macs just have one, iOS has one, Android has one, Linux has one.
These are not JavaScript APIs, however, they are only usable at the moment by assistive technologies. But they can be used to find out pertinent information about the objects on screen. If we were then to recreate that check box in code in the browser, just some simple HTML, and inspect it for accessibility information through the browser, we'd find exactly the same information: it has a role of checkbox, an accessible name of Bold courtesy of its label element, and it's a focusable element, it's focused and it's currently checked, courtesy of the checked attribute. So no matter whether this check box was part of a platform component or part of a web component, the information should, in theory, be exactly the same. And these relationships between standard software UI components and their web counterparts are documented in a series of documents called accessibility API mappings documents. There's a core mappings document. The one probably most relevant to us as developers at the moment is the HTML mapping document. So if you want to know what an HTML element maps to in terms of the information the APIs will retrieve about it, this is the best place to start looking. There's one thing to be aware of though, there are tiers of accessibility support in a browser. A browser may choose to support an element, it may also then choose to support that element for accessibility, and a screen reader or other assistive technology may then choose to make use of that accessibility support and support it in its own right. There is a site, run with help from Microsoft by a friend and colleague of mine at TPG, Steve Faulkner, that documents the level of browser support for accessibility of HTML5 elements and it's a really good resource if you want to find out whether something is viable in accessibility terms. So a good example for a long time were the details and summary elements in HTML5.
Some browsers didn't support them at all, still don't. Other browsers did support them but they didn't make any accessibility information available to assistive technologies, so they weren't good on accessibility even though they worked functionally. So when the browser gets hold of a document, it parses the code and it creates the document object model, something you'll be incredibly familiar with, I'm absolutely sure. What it also does is create an accessibility tree, it takes a bunch of the information out of the DOM, creates another hierarchical structure, but this time it contains all the accessibility information, of the kind we were looking at with that checkbox. The role of each element, its accessible name if it has one, accessible description if longer information is available about it, and information about its state, whether it's checked, pressed or any of the other things that are otherwise probably visually available. And these APIs that I was mentioning before can be used to query the accessibility tree in the browser, so this is how screen readers and other assistive technologies get the information we heard in those first few clips. That's how they know what a heading is, what a link is, what an input box is, all of those different kinds of things. The question then is what do we do if we're building something that doesn't have native accessibility? What if we're using a JavaScript framework that prefers to recreate links using spans? Because that's a really good idea, but it happens. What if we're trying to create something that doesn't exist in HTML, like tab panels or progress bars or other bits and pieces where there isn't good support and we have to start from scratch? You can actually polyfill accessibility using a technology called ARIA, which stands for accessible rich internet applications.
It's a suite of attributes that you can use with HTML and also, if you're interested, SVG, to polyfill the roles, accessible names and descriptions and other state information about custom components and widgets that you're building. The first rule of using ARIA is actually don't use it. If you can use native HTML or whatever host language, SVG, then do it. Because you'll get a lot of stuff for free. All of that role, accessible name and state information I was talking about earlier, you get that for free. So if you use a link, you get its role of link, you get the fact that you can focus on it with a keyboard, the screen reader knows that it's a link, if it's visited the browser'll communicate that too. And you just do that all for a very simple bit of HTML, so to make things easier on yourselves as developers, the best thing you can do is let the browser and the code do as much of the accessibility heavy lifting as you can possibly let them get away with. So a quick example, if we just take this really simple button example, because what would a code talk be without a button example? Then we can just take a look at how a screen reader will report that button if it comes across it. - [Synthetic Voice] Tequila button. - So it's really simple, it just recognises that it's a button element and it recognises its accessible name based on the text that was inside the button element. It's really easy and very limited in terms of the effort you need to do it. There are some times of course, like I said, when you do have to polyfill some of this accessibility information. This is the real world after all. And the two elements to watch out for are the div and span elements. Natively they don't communicate any accessibility information through the browser to the APIs and to assistive technologies. 
So if you're starting to build something with one of these two elements as your primitives, your building blocks, then that's a real flag that you need to start thinking about putting in place some accessibility information. So if we take this span example, pretending to be a link or a button or whatever, then this is how a keyboard user would be able to interact with it. Short answer is that they can't. A span element can't be focused on, the browser doesn't make it focusable, it's not a native, interactive element in HTML. So if you're just a keyboard user, you're pretty much off to a non-start. Screen reader users have other sneaky ways of navigating so for example another one of those shortcuts, B, will often jump you to the next button in a page if you use the Jaws screen reader. This is what they would get if they happened to try jumping to this particular button, though. - [Synthetic Voice] Tequila. - Just the word tequila, because there's no accessibility information available about a span. The browser makes nothing available to the screen reader so it's totally oblivious to the fact that it looks like a button and, if you happened to use a mouse, probably behaves like a button as well. But from a screen reader user's point of view, it's just a piece of plain text that certainly I would blithely ignore and carry on through the page, trying to find the thing I was trying to get accomplished. So we can add some semantics using ARIA, I mentioned the role attribute, we can use that with a role of button. We can also use the tab index attribute with a value of zero to make the thing focusable for all keyboard users. Tab index can actually take a couple of different values. Zero is the one to use if you wanna make something focusable at its location, based on its position in the DOM. If you're trying to make something focusable, nine times out of ten, zero is the best way to do it.
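A minimal sketch of the two attributes just mentioned, assuming the fake button is a span reachable from script. The helper name `exposeAsButton` and the selector in the comment are made up for illustration; the slide code from the talk is not reproduced in this transcript, so this is not the speaker's own example.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings;
// a real HTMLElement satisfies it.
interface AttributeTarget {
  setAttribute(name: string, value: string): void;
}

// Hypothetical helper: give a span some of what a native button gets for free.
function exposeAsButton(element: AttributeTarget): void {
  // Role: tells the accessibility tree that this thing is a button.
  element.setAttribute("role", "button");
  // tabindex="0": focusable from the keyboard, at its position in the DOM.
  element.setAttribute("tabindex", "0");
}

// In a browser this might be used as follows (the selector is an assumption):
//   const fake = document.querySelector<HTMLElement>("span.tequila");
//   if (fake) exposeAsButton(fake);
```

This only covers the role and the focusability. Nothing here makes the span respond to the enter or space keys; that still has to be scripted separately.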
If you wanna take something out of the keyboard sequence, you can give it a tab index of minus one. Use this really carefully. You've really gotta be sure that you do not ever want a keyboard user to focus on this thing before you use that tab index value. The reason we use minus one most often, actually, is because the thing remains usable in scripting. So if you move focus to an object using JavaScript, you can still do that to an object that has tab index minus one. The value for tab index you almost never want to use is a positive number, because that forces a tab order on the page. The thing that has tab index one will be the first focusable element on the page, even if it's the last thing at the bottom of the page that anybody might choose to focus on. So once you start taking control of the tab sequence, you've got to keep control. And that means forever, every time you maintain the page, update it, change something, move it around. If you're using any of the CSS layout properties and you rearrange things visually on screen, you've got to think about tab index again. So it's a real headache from a development point of view and the chances are it'll be a pain for users too. So in this instance, tab index zero is the way to go. And it makes things quite a bit different for both keyboard users and screen reader users. - [Synthetic Voice] Tequila button. - And so again, it's pretty much identical to the information that we had with native HTML. The screen reader recognises that it's a button because it's focused on it and it's queried it and picked up all that information and made it available. One thing you wanna do with ARIA, or rather don't want to do with ARIA, is break native semantics. It's important to understand that ARIA only affects the browser's accessibility tree, it doesn't affect the DOM. So it's really easy to confuse a screen reader user into thinking they're dealing with one thing when they're actually dealing with another.
So if we had that example there, I just changed the button into a heading, I just gave it a role of heading and an ARIA level of one. And this is what happens from a screen reader user's point of view. - [Synthetic Voice] Heading level one, tequila. - So my screen reader has just now told me that this is a heading level one. Except of course it's still a button and if I should try to click on it or use the enter key on it, it will probably still behave like a button. So we've really just royally screwed things up from an accessibility point of view. So be really careful when you apply roles to things, don't override a native element's semantics unless you're really sure it's what you want to be doing because the results can be really quite catastrophic in terms of confusing the hell out of users. When you use ARIA and particularly when you're thinking about custom widgets and components, you need to script in keyboard functionality, because unless you're using one of the simple native interactive elements provided by the browser, like links or buttons or form fields, you've gotta think about keyboard support and JavaScript yourself. Sometimes it means you'll want to supplement existing interaction, so a common pattern that we see is something that is an anchor link under the hood in the HTML, but it's styled to look like a button, it's a really, really common design pattern. From a keyboard user's point of view, the expectation is that you can activate a button with the enter or space keys, but a native link can only be activated with the enter key, so in the scripting we have to supplement that interaction provided through the link in the browser, with the expected interaction for a button. So in this case, all we've done is just added in event handlers for key down and we're listening out for the space and the enter keys and then we're doing whatever it is that the button needs to do. We're just supplementing the existing action. 
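A hedged sketch of the key down handling described above for a link styled as a button. The talk mentions key codes; this version uses the `key` property of the keyboard event instead, where the space bar is reported as a single space character. `handleButtonKeydown` and `isButtonActivationKey` are hypothetical names, and mouse and touch activation are assumed to be wired up separately.

```typescript
// The two members of a keyboard event this sketch relies on; a real
// KeyboardEvent satisfies this shape.
interface KeyEventLike {
  key: string;
  preventDefault(): void;
}

// A button is expected to respond to both Enter and Space.
function isButtonActivationKey(key: string): boolean {
  return key === "Enter" || key === " ";
}

// Runs the button's action for Enter or Space and ignores every other key.
function handleButtonKeydown(event: KeyEventLike, activate: () => void): void {
  if (!isButtonActivationKey(event.key)) {
    return; // Tab, arrows and so on keep their normal behaviour
  }
  // Suppress the browser's own handling of the key (for example, Space
  // scrolling the page) so that only the button action runs.
  event.preventDefault();
  activate();
}

// Browser wiring (element and action names are assumptions):
//   link.addEventListener("keydown", (event) => handleButtonKeydown(event, openDialog));
```

The same handler would serve a div or span acting as a button, where the browser provides no key support at all and both keys have to come from script.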
There are also times when you're gonna need to provide all the interactions. So again, if you're starting with your divs and spans, you're gonna need to do all of that, because there's nothing from the browser to work with, so no enter key support at all, and you have to provide both enter and space key support. - [Synthetic Voice] Button collapsed, enter expanded, blank, makes me happy. - So when we put all of this together and we provide all the keyboard interaction, the ARIA, the semantic polyfill and all the other bits and pieces, we actually get a widget, which in this case it turns out is a disclosure widget, that's well supported by screen readers and keyboard users, is actually functionally correct, and looks and works pretty much the way we'd expect for mouse users and of course touch users as well. So quick reminder of the keyboard intercept chain, because it becomes really relevant with ARIA and some specific roles. So remember that the screen readers either listen out for the keyboard commands and then, if they don't want them, pass them back through to the browser, or they do it the other way around. You can catastrophically blow up the entire car using one role. It's the application role. It basically removes the screen reader from the equation entirely. So when you stick role equals application on a container, whether that's the body or a div somewhere inside your code, you are effectively removing the screen reader, the user has no access to any of those shortcut keys, they are pretty much reduced to using the tab key for navigation. So use this very, very lightly. The rule of thumb is that if you use role application anywhere, you need to provide not only all the keyboard interaction that you would expect a keyboard user to want, but also all the shortcuts and other bits and pieces that a screen reader user is typically gonna want to use to get around that widget that you've just built.
And believe me, that's an awful lot of work. At a rough calculation, it takes about 20% extra code to make a very, very simple thing like a button accessible in terms of polyfilling it and providing the scripted functionality. If you upscale that to say a fully fledged web application, you're talking about a lot more time, money and testing involvement that you need. There are a whole bunch of other roles, just to make life interesting from a developer's point of view, not to mention a user's point of view, that will also trigger application mode. So they'll basically trigger the same behaviour in the screen reader. These include the grid role, tab list, tree, tree grid, all of the roles that basically are affiliated with creating interactive components. In other words, components that have complex keyboard interactions. Something that we actually need to work on at W3C is a better list of exactly which those roles are because, believe it or not, such a thing doesn't really exist with any clarity. Several attempts have been made but it's something we definitely need to do because it is incredibly important. So quick example, I mentioned tab list was one of the roles, and it's a very common design pattern that we see on the web where ARIA is used to polyfill accessibility. By putting the role of tab list on this UL element, we're converting it from a list of items into a list of tabs. We're using a role of presentation on the list item because once we've made the conversion of the list into a set of tabs, we don't need those list items, they're just there for fallback purposes, and role presentation says "If you're treating this as a set of tab panels, then just ignore this list item element". We override the anchor element with a role equals tab.
This is one of the times when you can be reasonably sure it's okay to override native semantics, 'cause of course if there's a rule in coding anywhere, two minutes later there's a slide telling you to do exactly the opposite. But this is one of those times and in this case we're just saying "Don't treat this thing as a link, treat it as a tab." The reason I show you this is because this is what happens if we apply that ARIA but without really doing anything about the JavaScript and the keyboard interaction. - [Synthetic Voice] Blanco, tab selected, use Jaws key plus alt plus M to move to a controlled element. - So we could tab onto that first tab, we get told it's a tab and we get read its accessible name, the blanco tequila, but that's it. We can't move between these tabs, there's nothing we can do to select any of the other tabs, the screen reader uses none of the obvious key strokes, we can't use left and right arrows, up and down arrows to move or cycle between them as you'd expect to in say a tabbed interface on software. The only thing we can do is just tab out of this again and hope there wasn't anything terribly interesting that we wanted to read about. So we need to provide the keyboard interaction. In this case, we need to listen out for the different arrow key strokes, again, in the same way that we did with the button, listen for the key down events, and the key codes for left, right, up and down. And when we find them, enable them to change focus and move the screen readers focus and keyboard focus through the tabs in pretty much the way you'd expect them to operate. - [Synthetic Voice] Tab, Blanco, tab selected, use Jaws key plus alt plus M to move to a controlled element. Reposado tab selected, use jaws key plus alt plus M to move to a controlled element. Joven tab selected, use jaws key plus alt plus M to move to a controlled element. 
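A minimal sketch of the arrow key handling described above, again using the `key` property of the keyboard event rather than key codes. The function names are hypothetical. Wrapping from the last tab back to the first, setting `aria-selected`, and keeping only the current tab in the tab sequence with tab index values of zero and minus one are assumptions about how such a widget is commonly built, not details taken from the demo.

```typescript
// Works out which tab should receive focus for a given key.
// Any key that is not an arrow key leaves the current tab unchanged.
function nextTabIndex(current: number, tabCount: number, key: string): number {
  switch (key) {
    case "ArrowRight":
    case "ArrowDown":
      return (current + 1) % tabCount; // wrap from the last tab to the first
    case "ArrowLeft":
    case "ArrowUp":
      return (current - 1 + tabCount) % tabCount; // wrap from the first to the last
    default:
      return current;
  }
}

// The parts of an element this sketch touches; a real HTMLElement satisfies it.
interface TabLike {
  setAttribute(name: string, value: string): void;
  focus(): void;
}

// Marks one tab as selected and moves keyboard focus to it.
function selectTab(tabs: TabLike[], index: number): void {
  tabs.forEach((tab, i) => {
    const selected = i === index;
    tab.setAttribute("aria-selected", String(selected));
    // Only the selected tab stays in the tab sequence; the others remain
    // focusable from script because minus one still allows focus().
    tab.setAttribute("tabindex", selected ? "0" : "-1");
  });
  tabs[index]?.focus();
}

// Browser wiring (variable names are assumptions):
//   tabList.addEventListener("keydown", (event) => {
//     current = nextTabIndex(current, tabs.length, event.key);
//     selectTab(tabs, current);
//   });
```

Keeping the index arithmetic in its own function makes the wrap-around behaviour easy to check without a browser.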
- And there, quite simply, we can now access all of the different tabs and we could, if we wanted to, move into the contents. That last announcement from the screen reader, use Jaws key plus alt plus M to move to the controlled element, gives users of the Jaws screen reader a shortcut for moving into the content of the tab panel itself. And so we need to provide that functionality to make this widget as usable for screen reader users as it is for everyone else. There is something new on the horizon, just a final thought I would like to leave you with, and that's the accessibility object model. It's a JavaScript API that will enable us as developers to interact with the browser's accessibility tree. It's being put together by developers from Google's Chromium team, Firefox and Apple's Safari team. It's in early incubation at W3C at the moment, there's a GitHub repo for it if you're curious, where it outlines the plans. But what it'll enable us to do is not only to query that accessibility API for information ourselves, and make use of it within our applications, it'll also actually enable us to insert nodes into the accessibility tree, change the information, and otherwise manipulate it in pretty much all the ways that assistive technologies like screen readers can manipulate it now. If we thought that with JavaScript and ARIA alone it was easy to blow the entire car up, when we get our hands on this little API, in accessibility terms we're gonna have even more opportunities to catastrophically blow things apart from a user's point of view. This is a really promising and interesting way forward for accessibility. It's gonna solve a lot of the problems that we have with incompatibilities between different platform accessibility APIs. Think back to the CSS browser incompatibility days of maybe five, ten years ago and you'll have some sense of the problems that we're facing in accessibility terms at the moment.
So I do urge you to go and check this out and if you're interested in this field, do file issues, provide feedback because I know the developers would really welcome hearing from people like you who use JavaScript every day and probably know it far better than those of us working on standards in real day to day terms. So, as a last note, just remember, you're only supposed to blow the bloody doors off. Please don't go out there and blow the entire car up, I kinda need that car. It's remarkably easy to do, but you know what? This is the web, it's remarkably easy to screw up a whole bunch of stuff and we pretty much manage to dodge that most days, most of the time if the gods of coding are looking out for us somewhere. But just bear this in mind because it really does make a human difference when you're talking about it in accessibility terms. Thank you.