Microsoft Seeing AI Usability Study

November 2018 • Team Project for Accessibility Considerations in HCI

Purpose: A common challenge faced by individuals with visual disabilities is identifying features of everyday objects, such as text, labels, colors, signage, and products. From reviewing existing research, we found a number of free or cost-effective applications that use Optical Character Recognition (OCR) to identify features (e.g., colors and text) and read them aloud. However, the accuracy of OCR tools varies, depending on factors such as ambient lighting, distance from the target surface, and camera quality.

Microsoft Seeing AI is designed to help users recognize objects, identify colors, and much more, but little research is available on the usability and accuracy of the application. For this study, we aimed to address this gap in the literature through a pilot usability study involving four participants with visual disabilities.

Method: Usability Testing

Tools: Stormboard (group affinity diagramming), Google Sheets

Full Report: Microsoft Seeing AI Usability Study


—Overview—

For this usability study, we evaluated the effectiveness of, and user satisfaction with, three key functions of the Seeing AI application. While the application does assist people with visual disabilities in day-to-day tasks, we found a number of areas in need of improvement. Key issues to address include providing more actionable audible feedback when an object is out of camera view or focus, and integrating the text translation tools more tightly.

 

—Method—

Participants

Four participants were recruited through the Chicago Lighthouse for the Blind and Visually Impaired and UserTesting.com.

Participants were from the Chicago Metropolitan area or Seattle, Washington.

All participants were familiar with using an iPhone and three were familiar with Seeing AI.

Participants had at least one of the following conditions: blindness, macular degeneration, deteriorating eyesight, blurred vision, depth-perception problems, or corneal abrasions.

Tasks

Participants were asked to complete three scenario-based tasks. All tasks were audio- and video-recorded for data analysis.

 

—Data Analysis—

The team inductively coded our notes and then sorted the codes using Stormboard, an online affinity diagramming tool, to identify common and salient themes.

We compared time on task and participant success (pass/fail) for each task.
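The pass/fail and time-on-task comparison amounts to a simple per-task aggregation. A minimal sketch in Python is shown below; the `summarize` helper and all data values are illustrative placeholders, not the study's recorded measurements:

```python
def summarize(results):
    """Aggregate per-participant task results.

    results: list of (passed: bool, seconds: float) tuples,
             one per participant, for a single task.
    Returns the task success rate and mean time on task.
    """
    n = len(results)
    passes = sum(1 for passed, _ in results if passed)
    mean_time = sum(seconds for _, seconds in results) / n
    return {"success_rate": passes / n, "mean_time_s": mean_time}

# Example with four hypothetical participants on one task
# (placeholder timings): three passes out of four gives a
# 75% success rate, matching how rates are reported below.
task = [(True, 95.0), (True, 120.0), (False, 180.0), (True, 60.0)]
print(summarize(task))
```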

 

—Findings—

Table 2: Results from the 3 tasks


Task 1, Translation of a Greeting Card, achieved a task success rate of 50%. Contributing factors included target framing, target distance, channel selection, and target orientation.

Task 2, Product Identification via barcode scan, saw a 75% success rate, with one participant giving up. Contributing factors included target framing, target distance, channel selection, and barcode database coverage.

Task 3, Color Identification, saw a 75% success rate, with one participant giving up. Contributing factors included target framing, target distance, and a high rate of false-positive responses from the environment.

Aggregate results, including time-on-task and ease-of-use responses, are shown in Table 2.

Regardless of the goal, the majority of observed participant challenges can be traced back to four common themes:

  1. Tool Inaccuracy

  2. Camera Usability

  3. Output Confusion/Ambiguity (Feedback, AI Responses)

  4. Feature Confusion/Ambiguity (Tool Names, Guidance)

Detailed Findings

(1) Tool Inaccuracy

  • The app often gave a correct response only after multiple incorrect responses and false positives.

  • Handwriting was read as gibberish, or as only a few words devoid of context, prompting participants to ask questions such as P2’s: “Is the card written in English?”

  • The product barcode database is incomplete, and the tool was observed erroneously attempting to scan a target label’s graphics rather than the barcode itself.

  • Items in the environment, such as the table or the user’s skin and clothing, were frequently scanned. Occasionally our target apple was identified, leading to humorous exchanges such as P3’s “Brown or grey apples?”

 

(2) Camera Usability

Participant attempts to get an apple into her camera’s field of view for interpretation. In this image, app is scanning her skin color and table color.


  • The optimal phone-to-target distance required by the Seeing AI app is inconsistent with other tools frequently used by people with a visual disability, and it differs from tool to tool within the app itself. For example, colors were best identified from only a few inches away, while reading a document required the phone to be held at a much greater distance. At those larger distances, participants struggled to angle their phones correctly to get the full target in frame.

 

(3) Output Confusion/Ambiguity (Feedback, AI Responses)

As shown, participant is unsure which way to move the can.


  • Seeing AI did not provide clear audio guidance to help participants align a target in frame or use a tool correctly. Audio feedback such as “No Edges Visible” was not helpful, prompting questions to the research team such as P1’s “Am I close?”

  • Participants were frustrated with the audio beeping guide because it did not indicate whether the user was moving closer to a target, nor which way to move it (up vs. down, left vs. right, clockwise vs. counterclockwise).

 

(4) Feature Confusion/Ambiguity (Tool Names, Guidance)

  • Participants had difficulty choosing among modes such as short text, document, and handwriting; they swiped through each option and spent considerable time navigating before it was clear which one to select.

  • When navigating, participants would often gesture faster than the Seeing AI app could audibly announce its channels, creating a lag. As P4 described, “swiping up and down through the picker is bothersome.”

  • In some cases, the study moderator had to remind the participant about the handwriting channel. Because our participants did not know what type of text was on the card, they would cycle through each channel until one provided some positive feedback; the channels themselves offered no guidance on which might yield a better result.

 

—List of Recommendations—

  • Seeing AI should not emit random or irrelevant audio feedback.

  • The app should allow quicker switching into the product function.

  • Improve the OCR function to better read text from a printed page.

  • The handwriting mode should integrate better with the camera so it can fully focus on handwritten text and capture it.

  • Improve camera focus on objects so they can be captured clearly.

  • The app should limit beeping while identifying a product.

  • Improve the speed and accuracy of identifying objects and reading text back to users.

  • The color tool should capture object colors more accurately.

  • Improve the speed and accuracy of the swipe navigation so channels and modes are read back to the user more quickly.

 

—Limitations & Future Work—

Small Sample Size

  • Four participants (3 in person and 1 remote)

  • A small sample size may not yield findings that generalize to the larger population

Environmental Factor

  • Participants had different devices and settings; for example, differences in screen size and camera quality may have affected the test. One participant had a damaged screen, so some controls within the app could have been affected

Future Study

  • Test with a larger sample size to gather more data to work with

  • Keep the condition and settings of the testing devices constant to reduce the impact of device differences on the results

  • Change the order of tasks and use alternative actionable guidance prompts to assist users in successful task completion

  • This study focused on only three of the nine tools within the Seeing AI app; consider conducting tests with the other features