Authors:
Jaylin Herskovitz, Andi Xu, Rahaf Alharbi, Anhong Guo
Paper:
https://arxiv.org/abs/2408.10499
Introduction
Visual access tools have significantly improved the lives of blind and visually impaired people. However, these tools are usually built for common scenarios and are hard to customize for unique, personal needs. Users must instead adapt their own behavior to fit what the tool can do, which adds cognitive load. To address this gap, the paper introduces ProgramAlly, an end-user programming tool that lets blind users create and customize visual information filtering programs. ProgramAlly combines three end-user programming approaches: block-based programming, natural language programming, and programming by example, giving users several accessible ways to build personalized assistive technology.
Related Work
Information Seeking in Assistive Technology
Assistive technologies have long aimed to help users locate specific information within visual scenes. Early solutions like VizWiz allowed users to submit images and questions to human assistants for answers. More recent technologies, such as Find My Things and VizLens, use object recognition to help users locate specific items or buttons. However, these tools often provide general descriptions, which can overwhelm users seeking specific information quickly.
Methods for Personalizing Assistive Technology
Personalization in assistive technology typically focuses on adapting input and output mechanisms to user needs without altering the core functionality. For instance, teachable object recognizers allow users to train models to recognize personal items. However, commercial applications often limit customization options, leaving users with few avenues to tailor the technology to their specific needs.
DIY Assistive Technology
DIY approaches in assistive technology emphasize personalization, democratization, and collaboration. Research has primarily focused on physical tools, such as 3D-printed devices and custom prosthetics. High-tech DIY projects, like the Blind Arduino Project, have also emerged, but there is limited research on DIY software systems for existing devices.
End-User Programming
End-user programming enables non-professionals to create programs for personal use. Approaches like block-based programming, natural language programming, and programming by example have been developed to make programming more accessible. However, these methods have not been extensively applied to visual assistive technology, presenting an opportunity to explore their potential in this domain.
Research Methodology
Design Goals
ProgramAlly was designed with three primary goals:
- Expressiveness: The tool should support a wide variety of real-world use cases through a flexible structure and range of models.
- Approachability: ProgramAlly should be accessible to non-experts, offering multiple modalities for creating and iterating on programs.
- Accessibility: The tool must be VoiceOver and Braille display accessible, providing context for each statement and visual feedback while running programs.
Visual Filtering Programs in ProgramAlly
ProgramAlly’s visual filtering tasks are based on real-world scenarios encountered by blind individuals. The tool uses a generalizable representation of filtering tasks, allowing users to create programs with statements like “find NUMBER on BUS.” Programs are stored as lists of items, each consisting of a target (e.g., object, text) and optional adjectives (e.g., color, size, location).
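The item-list representation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Item` and `Program`, the `describe` method, and the "X on Y" rendering are hypothetical and are not taken from ProgramAlly's implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """One entry in a program: a target plus optional adjectives (hypothetical model)."""
    target: str                                           # e.g. "number", "bus", "text"
    adjectives: List[str] = field(default_factory=list)   # e.g. ["red"], ["largest"]

    def describe(self) -> str:
        # Adjectives come before the target, as in "largest package".
        return " ".join(self.adjectives + [self.target])


@dataclass
class Program:
    """An ordered list of items, read aloud as 'find ITEM on ITEM ...'."""
    items: List[Item]

    def describe(self) -> str:
        return "find " + " on ".join(item.describe() for item in self.items)


# The example statement from the text: "find NUMBER on BUS".
bus_number = Program([Item("number"), Item("bus")])
print(bus_number.describe())

# A program whose second item carries an adjective.
label_text = Program([Item("text"), Item("package", ["largest"])])
print(label_text.describe())
```

Keeping a program as a flat list makes it straightforward to read out one statement at a time, which fits the screen reader and Braille display goals listed earlier.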
Running Programs
Programs are executed by iterating over the list of items, using object detection and text recognition models to filter and process the source image. The tool provides descriptive output, indicating where target items were found or where filtering failed, helping users understand the scene and aim the camera.
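The execution loop described above might look roughly like the following sketch. It makes several assumptions that the summary does not confirm: detections arrive as pre-computed labeled bounding boxes rather than from a real model, items are evaluated from the outermost (last) to the innermost (first), and "on" is approximated by bounding-box containment. The names `Detection`, `contains`, and `run_program` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


@dataclass
class Detection:
    """A labeled region, standing in for object detection or text recognition output."""
    label: str
    box: Box


def contains(outer: Box, inner: Box) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def run_program(targets: List[str], detections: List[Detection]) -> str:
    """Filter detections step by step and describe the outcome.

    `targets` is ordered as spoken: ["number", "bus"] means "find number on bus".
    """
    if not targets:
        return "Empty program."
    regions: Optional[List[Detection]] = None  # None means the whole image
    found: List[str] = []
    for target in reversed(targets):  # assumed order: outermost item first
        matches = [
            d for d in detections
            if d.label == target
            and (regions is None or any(contains(r.box, d.box) for r in regions))
        ]
        if not matches:
            # Report where filtering failed so the user can re-aim the camera.
            if found:
                return f"Found {found[-1]}, but no {target} on it."
            return f"No {target} found."
        found.append(target)
        regions = matches
    return "Found " + " on ".join(targets) + "."


scene = [
    Detection("bus", (0, 0, 100, 50)),
    Detection("number", (10, 5, 30, 15)),       # inside the bus
    Detection("number", (200, 200, 220, 210)),  # elsewhere in the image
]
print(run_program(["number", "bus"], scene))
```

Returning a message at the first failed step mirrors the descriptive output mentioned in the text: the user learns which part of the scene was recognized and which was missing.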
Experimental Design
Participants
The study involved 12 blind participants, recruited through email lists, prior contacts, and snowball sampling. Participants varied in age, occupation, and level of visual impairment. They were required to have an iPhone so that they could download ProgramAlly via TestFlight.
Procedure
Participants were introduced to ProgramAlly and asked to modify a pre-written program to familiarize themselves with the interface. They then used each of the three programming interfaces to create and run programs for specific tasks: block mode (block-based programming), explore mode (programming by example), and question mode (natural language programming). Remote participants used sample images, while in-person participants used props like books, grocery items, and packages.
Data Collection and Analysis
The study sessions were recorded, and participants’ strategies for completing tasks were analyzed. Qualitative data on participants’ workflows and feedback were collected, focusing on their experiences using ProgramAlly and the unique challenges they faced as blind end-user developers.
Results and Analysis
Using Filters in ProgramAlly
Participants generally found filtering programs useful, particularly for routine tasks. They appreciated the ability to create specific filters that existing assistive technologies could not provide. For example, participants preferred using ProgramAlly over Seeing AI for tasks like reading addresses on packages, as ProgramAlly provided more focused and relevant information.
Programming Process and Challenges
Participants faced challenges related to unknown parameters and object classes. They often debated the specificity and reusability of filters, balancing the need for detailed information with the desire for generalizable programs. The block-based programming mode required participants to adopt a programming mindset, breaking down tasks into components and understanding the structure of programs.
Comparing Creation Modes
Participants appreciated having multiple programming modes, each suited to different scenarios and levels of expertise. Block mode offered fine-grained control but had a higher learning curve, while question mode was faster and more approachable but sometimes produced ambiguous results. Explore mode helped participants discover new visual features but required careful selection of targets.
Benefits and Drawbacks of DIY-ing Assistive Technology
Participants valued the customization and control offered by ProgramAlly, enabling them to tailor assistive technology to their specific needs. However, some expressed concerns about the effort required to create programs, suggesting that a platform for sharing programs could enhance collaboration and reduce the burden on individual users.
Overall Conclusion
ProgramAlly demonstrates the potential of end-user programming approaches for creating and customizing AI-based assistive technologies. By providing multiple programming interfaces, the tool empowers blind users to create personalized visual filters, addressing unmet needs and enhancing their control over assistive technology. The study highlights the importance of balancing expressiveness, approachability, and accessibility in designing such tools, and suggests future directions for improving ProgramAlly’s utility and expanding its capabilities.
ProgramAlly represents a step towards democratizing the creation of AI-based technology, enabling blind individuals to customize their experiences and achieve greater independence. As assistive technologies continue to evolve, tools like ProgramAlly can help ensure that these advancements are accessible and tailored to the diverse needs of users.