Browse Mode and App Mode in Screen Reader

The Gist

This post gives some background on how screen reading software works, both on the web and in desktop applications. In a desktop application, a screen reader identifies the user interface elements rendered on screen and reads out the text caption associated with each element. A screen reader user listens to this information to navigate the interface.
On the web it works differently. Screen reading software internally maintains a virtual screen (often called a virtual buffer), into which it copies the content rendered on a web page. It then reads the content out from this virtual screen.

Screen reader navigating a desktop application

Consider a document editing application (MS Word, for example) in a desktop environment. Its user interface components include menus, ribbons, buttons, a rich text editing area and so on.

  1. These elements can be focused with the keyboard. Screen reader users generally cannot use a mouse to point and click.
  2. Whenever one of these elements receives focus, the screen reader announces the relevant information. For instance, pressing Alt+H within MS Word moves focus to the Home tab, and the screen reader speaks a message like “Home tab. To navigate through this ribbon, press Tab or Shift+Tab.”
  3. Screen reading software can also speak information from within a given element. For instance, inside a rich text editor (e.g. MS Word), it also speaks details such as font, font size, alignment and so on.

Unlike in web documents, interactive elements such as buttons, menus, text boxes and check boxes are not usually embedded within a Word document itself; they belong to the application around it. In a web page, these elements are part of the page itself. This distinction is key to understanding the two modes a screen reader supports on the web, which are discussed in detail in the next section.

Screen reader navigating a Web Page

When browsing a web page, on the other hand, interactive form elements and rich widgets are part of the web document itself. This is where the virtual screen of a screen reader comes into the picture. To use the web efficiently, a screen reader user needs to be able to do the following.

  1. Navigate and access the content of a web page quickly and efficiently. That is, the user should be able to identify and jump between regions such as navigation, main, sidebars, header and footer, and to read data presented in tabular and graphic formats.
  2. Interact with form elements and rich widgets with the same fluency as any other user. That is, the user should be able to identify and correct errors in a form, and to operate rich widgets such as a tree view (expanding and collapsing items) or a grid (navigating cells).
  3. Be notified whenever dynamic content is rendered on the page, for example chat messages being added to a page element or errors appearing next to form elements.

For this, screen reading software provides two basic modes:

  1. A mode for browsing the elements of a web page. For instance, JAWS has Document mode (the virtual PC cursor) and NVDA has Browse mode. In this mode a screen reader user can find elements such as anchors, input fields and headings: pressing “e” jumps between text boxes and text areas, “b” jumps between buttons, and so on. This mode uses the virtual screen. The web content is copied to the virtual screen when the page loads, and it is inside this virtual screen that quick navigation keystrokes like “e” and “b” work.
  2. A mode for interacting with elements that are interactive in nature. For instance, JAWS has Forms mode and Application mode, and NVDA has Focus mode. In this mode a screen reader user can interact with the focused element, that is, type within a text field, use the arrow keys to expand and collapse a tree view, or navigate the cells of a grid.
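
The quick navigation behaviour in the first mode can be pictured with a small sketch. It treats the virtual screen as a flat, top-down list of items and looks up the next item of the kind a key stands for. The names `VirtualItem`, `quickNavKeys` and `nextOfKind` are hypothetical and chosen only for illustration; this is not how JAWS or NVDA are implemented.

```typescript
// Hypothetical model of a virtual screen; not the API of any real screen reader.
type ElementKind = "heading" | "link" | "button" | "edit";

interface VirtualItem {
  kind: ElementKind;
  label: string;
}

// Quick navigation keys mentioned in the text, mapped to the kind they jump to.
const quickNavKeys: Record<string, ElementKind> = {
  e: "edit",
  b: "button",
};

// Return the index of the next item of the requested kind after `from`,
// or -1 when the key is not a quick navigation key or nothing matches.
function nextOfKind(buffer: VirtualItem[], from: number, key: string): number {
  const kind = quickNavKeys[key];
  if (kind === undefined) return -1;
  for (let i = from + 1; i < buffer.length; i++) {
    if (buffer[i].kind === kind) return i;
  }
  return -1;
}

// A page copied into the virtual screen in top-down order (made-up content).
const page: VirtualItem[] = [
  { kind: "heading", label: "Sign in" },
  { kind: "link", label: "Help" },
  { kind: "edit", label: "Email" },
  { kind: "edit", label: "Password" },
  { kind: "button", label: "Submit" },
];

console.log(nextOfKind(page, 0, "e")); // 2, the "Email" field
console.log(nextOfKind(page, 0, "b")); // 4, the "Submit" button
```

Because the lookup runs against the copied list rather than the live page, the keystroke never reaches the web page itself.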

Switching between Browse/Document mode and Focus/Application mode

Below are a few scenarios in which a screen reader generally switches between the two modes. The examples are from Google Search and Google Drive.

Google Search – handling forms

When Google Search opens, JavaScript on the page sets focus to the search text field, and the screen reader automatically switches to Focus/Forms mode. In this mode the user can type into the text field. If focus instead happens to be on the links at the top of the page, the user can press “e” to move to the search text field. Note that at this point the screen reader stays in Browse/Document mode. Pressing “enter” or “spacebar” activates Focus/Forms mode and allows typing in the text field. If the user skips this step and starts typing, each keystroke is treated as a quick navigation key and focus jumps around the web page. Note that this “enter” or “spacebar” is handled by the screen reader and not by the browser or JavaScript, which is why the form is not submitted.
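
The key handling described for the search field can be sketched as a tiny state machine. This is a simplified illustration with hypothetical names (`ReaderState`, `handleKey`, `pressAll`); it assumes the virtual cursor is already on the text field, and it assumes Escape as the key for returning to Browse mode, which varies between screen readers and their settings.

```typescript
type Mode = "browse" | "focus";

interface ReaderState {
  mode: Mode;
  typed: string; // text that actually reached the page's text field
  log: string[]; // what happened to each key
}

// Decide what a single key press does, depending on the current mode.
function handleKey(state: ReaderState, key: string): ReaderState {
  if (state.mode === "browse") {
    if (key === "Enter" || key === " ") {
      // Consumed by the screen reader: the page never sees it, so no form submit.
      return {
        mode: "focus",
        typed: state.typed,
        log: state.log.concat("switched to focus mode"),
      };
    }
    // Any other key is treated as quick navigation inside the virtual screen.
    return {
      mode: "browse",
      typed: state.typed,
      log: state.log.concat("quick navigation: " + key),
    };
  }
  if (key === "Escape") {
    // Assumption: Escape leaves focus mode.
    return {
      mode: "browse",
      typed: state.typed,
      log: state.log.concat("switched to browse mode"),
    };
  }
  // In focus mode keys are passed through; single characters end up in the field.
  const typed = key.length === 1 ? state.typed + key : state.typed;
  return { mode: "focus", typed: typed, log: state.log.concat("sent to page: " + key) };
}

// Feed a sequence of keys through the state machine.
function pressAll(keys: string[], start: ReaderState): ReaderState {
  let state = start;
  for (const key of keys) {
    state = handleKey(state, key);
  }
  return state;
}

const start: ReaderState = { mode: "browse", typed: "", log: [] };
console.log(pressAll(["c", "a", "t"], start).typed); // "" because nothing reached the field
console.log(pressAll(["Enter", "c", "a", "t"], start).typed); // "cat"
```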

Google Drive – handling widgets

After logging into Google Drive, focus is set by default on the grid of files and folders, and the screen reader automatically switches to Focus/Application mode. In this mode the user can use the arrow keys to navigate the cells of the grid. In Browse/Document mode, by contrast, the arrow keys simply read content from the virtual screen line by line or character by character. It is important to note how the behavior of the arrow keys and the quick navigation keys differs between Browse/Document mode and Focus/Application/Forms mode.
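
As a rough sketch of the automatic switch, the mode can be derived from the role of the element that receives focus. The role names below are real WAI-ARIA roles, but the list of roles that trigger Focus/Application mode is an assumption made for illustration; each screen reader applies its own rules. `modeForRole` and `describeArrowKey` are hypothetical helpers.

```typescript
type Mode = "browse" | "focus";
type ArrowKey = "ArrowUp" | "ArrowDown" | "ArrowLeft" | "ArrowRight";

// Assumed set of roles for which a screen reader switches modes automatically.
const focusModeRoles: string[] = ["textbox", "combobox", "grid", "tree", "application"];

// Pick the mode for the element that has just received focus.
function modeForRole(role: string): Mode {
  return focusModeRoles.indexOf(role) !== -1 ? "focus" : "browse";
}

// Describe who handles an arrow key in each mode.
function describeArrowKey(mode: Mode, key: ArrowKey): string {
  if (mode === "focus") {
    return "page widget handles " + key;
  }
  if (key === "ArrowUp" || key === "ArrowDown") {
    return "screen reader reads the previous or next line";
  }
  return "screen reader reads the previous or next character";
}

const driveMode = modeForRole("grid"); // focus lands on the file grid
console.log(driveMode);
console.log(describeArrowKey(driveMode, "ArrowRight"));
console.log(describeArrowKey(modeForRole("link"), "ArrowDown"));
```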

Web Accessibility for screen reader

Accessibility for web content can be categorized along the same two modes discussed above. The Web Content Accessibility Guidelines (WCAG 2.0) are intended to address accessibility issues faced in Browse mode, whereas Accessible Rich Internet Applications (WAI-ARIA) is intended to address accessibility with respect to the Application/Focus mode of screen reading software. Although this isn’t a hard and fast rule, it is how the two sets of accessibility standards differ from each other.
When implementing accessibility, it is important that both modes of screen reading software are properly catered to. To achieve this, keep the following in mind.

  1. Screen reading software can read out the content that is in focus.
  2. A screen reader user cannot perceive the visual layout, because in Browse/Document mode the entire content is presented as a top-down sequence.
  3. If a screen reader user needs to use a rich widget, its accessibility is ensured using the WAI-ARIA standards. This allows the screen reader to switch into Focus/Application mode.
  4. In either of the two modes, a screen reader user must be notified of
    1. Important dynamic changes (e.g. form errors).
    2. Help on using a widget when in Focus/Application mode.
    3. How to switch to Focus/Application mode when in Browse mode.
    4. How to switch to Browse mode when in Focus/Application mode.
  5. When in Browse mode, the web content must be easily discoverable (e.g. accessibility of non-text content and implementation of proper page structure).
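
Some of these points map onto concrete markup. The sketch below builds HTML strings for a form error that is announced as a dynamic change (`role="alert"`) and for a file grid exposed as a rich widget (`role="grid"` with `row` and `gridcell` children). The roles and attributes are standard WAI-ARIA; the helper names, the "Files" label and the sample content are made up for illustration, and a real grid would also need keyboard handling in JavaScript.

```typescript
// Escape text so it can be placed safely inside HTML content.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// A form error that screen readers announce as soon as it is added to the page.
function errorRegion(message: string): string {
  return '<div role="alert">' + escapeHtml(message) + "</div>";
}

// A one-column grid of file names; tabindex="-1" lets script move focus to a cell.
function fileGrid(names: string[]): string {
  const rows = names
    .map(function (name) {
      return (
        '<div role="row"><div role="gridcell" tabindex="-1">' +
        escapeHtml(name) +
        "</div></div>"
      );
    })
    .join("");
  return '<div role="grid" aria-label="Files">' + rows + "</div>";
}

console.log(errorRegion("Email is required"));
console.log(fileGrid(["Report.docx", "Budget & plan.xlsx"]));
```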