Assistive technologies have helped millions of people who are blind or have low vision with a variety of daily tasks, from converting text to speech and summarizing complex documents to using public transportation and navigating unfamiliar environments.
On the other hand, developing technology that can help people with vision disabilities find frequently used and easily misplaced personal items in almost any physical space has proved much more difficult. Until now.
Find My Things, developed by members of the Microsoft Research Teachable AI Experiences (Tai X) team with the help of a citizen design team they assembled for the project, is a teachable AI tool designed to solve that problem. Now available as a feature of the World channel in Seeing AI—Microsoft’s phone app for the blind and low-vision community that uses an intelligent camera and the power of AI to describe people, text, currency, color, and objects—Find My Things makes it easy for people to use their phones to recognize and locate the personal items they use every day. Find My Things was recently honored in both the accessible design and artificial intelligence categories of Fast Company’s 2024 Innovation by Design Awards.
Unlike other object recognizers that are pre-programmed to recognize a collection of generic objects, Find My Things gives people the power to personalize their experience by teaching the tool to recognize the items they actually use. This may include small items like house keys or earbuds, medium-sized items like backpacks or travel mugs, and even items that sometimes change shape, such as charger cords or a folding guide cane.
“I have a few small personal items that I always carry with me, and most of them are very important to me. Losing my bus pass means I can’t go anywhere. Losing my house key means I can’t get home. Because these items are small and I use them daily, it’s not unusual for me to lose track of them. With Find My Things, I can locate my personal items if they happen to be lost.”
– Karolina Pakėnaitė, Citizen Designer
The Find My Things experience in the Seeing AI mobile app consists of two parts: teaching and finding. To teach Find My Things to recognize a specific object and add it to their list of findable things, a person uses their phone to record four short videos of the item they have selected—from various angles and against different backgrounds. While shooting the videos, the person receives feedback from the tool to help them stay focused on the target object. The videos then serve as training data for the tool’s few-shot object-recognition model, which can be personalized on that individual’s device in just a couple of seconds.
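To make the teaching step concrete, here is a minimal sketch of how this kind of on-device personalization might work, assuming a prototypical-network-style recognizer with a frozen embedding backbone (a common approach in few-shot learning research). The backbone choice, frame handling, and function names below are illustrative assumptions, not the actual Seeing AI implementation.

```python
# Illustrative sketch only: personalizing a few-shot recognizer from a
# user's teaching videos, prototypical-network style. The backbone and
# all names here are assumptions, not the Seeing AI implementation.
import torch
import torchvision.models as models
import torchvision.transforms as T

# Stand-in embedding network; a deployed system would use a model
# meta-trained for few-shot recognition (e.g., on ORBIT-like data).
backbone = models.mobilenet_v3_small(weights="IMAGENET1K_V1")
backbone.classifier = torch.nn.Identity()  # keep pooled features only
backbone.eval()

preprocess = T.Compose([
    T.ToPILImage(),
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def personalize(videos_frames):
    """Turn a user's teaching videos into one 'prototype' per object.

    videos_frames: dict mapping object name -> list of HxWx3 uint8
    frames sampled from the four short teaching videos.
    Returns a dict mapping object name -> 1D embedding prototype.
    """
    prototypes = {}
    for name, frames in videos_frames.items():
        batch = torch.stack([preprocess(f) for f in frames])
        embeddings = backbone(batch)           # (num_frames, feat_dim)
        prototypes[name] = embeddings.mean(0)  # average = class prototype
    return prototypes
```

Because personalization in this sketch is just a forward pass plus an average, with no gradient updates, it is consistent with the couple-of-seconds, on-device figure described above.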
Using the tool to find an item is even easier. A person simply selects an object from the list of findable things they’ve created and then uses their phone to scan the environment—whether it’s their living room, a coffee shop, or a neighborhood park—until the tool locates the object. The tool then uses audio, visual, and vibration cues to guide the person to within arm’s reach of the object.
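A matching sketch of the finding loop, under the same assumptions, scores each camera frame against the stored prototypes and maps match strength to the cues the article describes. The get_camera_frame and cue callables, and the thresholds, are hypothetical placeholders; a real system would also localize the object within the frame to guide direction rather than scoring the frame as a whole.

```python
# Illustrative finding loop (an assumed design, not Seeing AI's code):
# scan frames, compare them to the chosen prototype, and surface
# audio/visual/haptic feedback as the match gets stronger.
import torch.nn.functional as F

def find(target_name, prototypes, get_camera_frame, cue):
    """Guide the user toward `target_name`.

    prototypes: dict produced by personalize() above.
    get_camera_frame: hypothetical callable returning the embedding of
        the next camera frame (camera + backbone, as sketched above).
    cue: hypothetical callable(kind, strength) standing in for platform
        sound, overlay, and vibration APIs.
    """
    target = F.normalize(prototypes[target_name], dim=0)
    while True:
        frame_embedding = F.normalize(get_camera_frame(), dim=0)
        score = F.cosine_similarity(frame_embedding, target, dim=0).item()
        if score < 0.5:              # thresholds are arbitrary assumptions
            cue("searching", 0.0)    # stay quiet; keep scanning
        elif score < 0.8:
            cue("vibrate", score)    # tactile cue: getting closer
        else:
            cue("found", score)      # within arm's reach; announce it
            break
```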
“Find My Things is one of the first fully working systems available to members of the public that allows people to teach an AI system to meet their own needs,” says Cecily Morrison, a senior principal research manager at Microsoft Research, who led the Find My Things project and was honored in 2020 as a Member of the Order of the British Empire for her services to inclusive design.
In this YouTube video, Theo Holroyd demonstrates how blind and low-vision people can teach the Find My Things AI-powered tool to recognize and locate their lost or misplaced personal items.
Making AI teachable—and personal
Morrison and her colleagues at Microsoft Research started exploring the potential of teachable AI several years ago. “In 2016, we posed a challenge for ourselves: ‘How can we use AI to augment human capabilities?’ We began by trying to figure out how AI could extend the already formidable capabilities of Paralympic athletes who were blind or low vision.”
As the team worked to define the AI system they would need to tackle that challenge, they also kept in mind that blind people are extremely diverse—and so are their needs. “As researchers, we knew it would be impossible to build a system that would suit everyone’s needs, especially as there were sure to be many needs we wouldn’t know about,” Morrison says. “We needed to create AI techniques that would give people agency in shaping systems that could meet their own needs. Teaching an AI system by providing user-generated training examples is one way to do that.”
Few-shot learning, an area of machine learning research that helps enable teachable AI, is one of the foundational elements that made it possible for the Tai X team to develop Find My Things. Researchers use few-shot learning to reduce the number of examples needed to complete a machine learning task and to enable AI models to adapt to diverse, real-world use cases. For example, adding a new object category to a typical deep learning model would require hundreds or thousands of high-quality examples, whereas a few-shot model needs only five or ten. The publication of new datasets such as ORBIT—a collection of videos of personal objects that blind and low-vision people recorded on their mobile phones—also helped researchers apply few-shot learning to real-world challenges and develop the Find My Things app.
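For readers curious about the mechanics, the sketch below shows one training “episode” for a prototypical network, a standard few-shot method in this line of research (the team’s exact model may differ; the shapes and embedding network are assumptions). The model is trained to classify held-out “query” images from only a handful of “support” examples per class, which mirrors the deployed scenario.

```python
# One episode of prototypical-network training: a generic few-shot
# method, shown as an assumed stand-in for the team's actual model.
import torch
import torch.nn.functional as F

def episode_loss(embed, support_x, support_y, query_x, query_y, n_classes):
    """Few-shot loss for one episode.

    embed:     trainable embedding network
    support_x: (n_classes * k_shot, C, H, W) handful of examples per class
    support_y: (n_classes * k_shot,) integer labels in [0, n_classes)
    query_x, query_y: held-out images/labels of the same classes
    """
    z_support = embed(support_x)              # (N_s, D)
    z_query = embed(query_x)                  # (N_q, D)
    # One prototype per class: the mean of its support embeddings.
    prototypes = torch.stack(
        [z_support[support_y == c].mean(0) for c in range(n_classes)])
    # Classify each query by (negative) distance to every prototype.
    dists = torch.cdist(z_query, prototypes)  # (N_q, n_classes)
    return F.cross_entropy(-dists, query_y)
```

Meta-training over many such episodes optimizes the model for exactly what the product needs: absorbing a brand-new object from five or ten examples rather than hundreds.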
Listen or read along as Microsoft Research Podcast guests Daniela Massiceti and Martin Grayson discuss the inspiration for Find My Things, the advances that made it possible, and larger lessons for building more inclusive AI experiences.
Citizen designers: Providing real-world guidance
As part of their accessibility work over the years, members of the Tai X team have often asked blind and low-vision people to test new technology intended for that community. For the Find My Things project, however, Morrison and her colleagues took a different approach.
“We invited eight blind or low-vision young people between the ages of 14 and 25 to join us as citizen designers. Our aim was to train them in the design process by having them work alongside our team as collaborators and co-creators, and to give them a voice in shaping a technology they could use in their daily lives. We also hoped this experience would give them the skills and the motivation to move into the technology field.”
– Cecily Morrison, Senior Principal Research Manager, Microsoft Research Cambridge
Over a four-month period, the Tai X team hosted three day-long workshops with the eight citizen designers. These sessions focused on user scenario development, the teaching experience, and the finding experience. In addition to interacting directly with citizen designers during the workshops, the Tai X team recorded the sessions, analyzed the recordings, and used what they learned to help shape the Find My Things design. As the work progressed, the citizen designers offered many insights that influenced the final outcome.
The citizen designers represented a diversity of vision disabilities—from people who have little or no vision, to those who still have varying degrees of sight, to those like Pakėnaitė who have multiple disabilities. Those differences influence how they experience and move through the world, which, in turn, often determines how they use technology.
For example, three of the designers preferred to read and write Braille and use screen readers to access their phones. The other five were print users with partial sight, who rely more on magnification. The Tai X researchers observed that those who preferred Braille held their phones horizontally while the print users usually held their phones vertically and at a 45-degree angle. Understanding that behavior helped the researchers create a video-capture process that works well for everyone regardless of how they hold their phone.
“The citizen designers were able to combine their new knowledge with their lived experience to help us make critical design decisions,” Morrison says. “For example, they helped us shape key use cases, design the video-capture experience, and A/B test different experience options. This led to insights such as not insisting that users follow sound cues to take a good teaching video, but rather just letting them know when the item goes out of frame.”
Analyzing the workshop recordings and engaging with the citizen designers also helped uncover the value of adding vibration and other tactile cues to Find My Things rather than relying solely on audio and visual feedback. “Having tactile feedback like vibration is very beneficial for people like me who are deaf-blind, but also for others who may not want to attract attention and feel self-conscious if their phone starts talking at high volume in a public place,” Pakėnaitė says. “Adding those tactile cues will benefit me and everyone else.”
On the Microsoft Research Podcast, Cecily Morrison and Karolina Pakėnaitė discuss how collaboration between researchers on the Teachable AI Experiences team and citizen designers helped make Find My Things an effective tool.
Designing AI solutions for marginalized communities
Teachable AI helps people create meaningful personalized experiences by training AI systems to meet their individual needs. This is especially important for people whose needs do not always conform to those of the broader population, such as people with disabilities or people from communities whose languages and cultures are underrepresented in the digital data used to train most AI models. Teachable AI offers a way for researchers to rethink current AI systems, develop inclusive solutions, and create human-centric experiences for millions of people who might otherwise be left behind.
According to Daniela Massiceti, a senior researcher at Microsoft Research and the lead machine learning engineer on the Find My Things project, technical innovation means breaking a challenge into multiple steps to make the problem manageable. But that can sometimes encourage researchers to think that all they need to do is solve the first step and then scale the solution. While this may serve most people, perhaps 80%, the same solution may not work for people whose needs or constraints differ from those of the majority.
“It’s easy for researchers to kind of parachute in, thinking they know what a marginalized community needs, and then build what they believe is a useful solution based on those assumptions. True engagement and working alongside members of the community during the development process is really important for understanding the actual needs and constraints and ensuring the technology you’re building is what the community wants.”
– Daniela Massiceti, Senior Researcher, Microsoft Research Cambridge
“In the spirit of inclusive design—design for one, extend to many—working with populations that have unique needs and constraints significantly improves innovations for everyone,” Morrison says. “It moves us beyond incremental development and allows us to make innovation leaps.”
“At Microsoft, our commitment is to serve everyone on the planet. That means everyone, not just 80% of users,” she says. “When we build AI solutions to work for marginalized communities, we often need new approaches to ensure that our systems scale to all users and not just the top slice. This removes Microsoft as the arbiter of who AI systems are designed for and gives people and communities the agency to ensure that, whoever they are, they can adapt an AI system to fit their needs.”
At CHI 2024, researchers presented their paper on how teachable AI can help people with disabilities.
Next steps
The development and release of Find My Things coincided with the explosion of generative AI models that are driving a new wave of research. While Tai X researchers continue to focus on accessibility and working with marginalized communities, they are also trying to understand how generative AI models are working or not working for these communities.
“It’s an exciting new era and very cool to be working with these new AI models,” Massiceti says. “We need new ways of thinking about technology when we’re building solutions for marginalized communities.”
The Tai X team is exploring how working with communities to gather training data in a fair and equitable way can improve the performance of these models and create new opportunities for community members. In addition, the team is exploring how bringing various aspects of teachable AI and personalization to generative AI experiences could make them even more powerful.
Find My Things is an example of how research at Microsoft often enhances Microsoft products and services. To try the Find My Things tool, download the free, publicly available Seeing AI app.
Explore more
Insights into the challenges and opportunities of large multi-modal models for blind and low-vision users: A case study on CLIP
Daniela Massiceti delves into the transformative potential of multimodal models such as CLIP for assistive technologies. Focusing on the blind and low-vision community, the talk explores how far we currently are from realizing this potential and the advances needed to bridge the gap.
Timeline: Assistive technology at Microsoft Research
Over the years, research teams have collaborated closely with people with disabilities and those who support them to thoughtfully and creatively innovate around their commitment to inclusive design and accessible technology.
Empowering people with AI
Microsoft Research’s Cecily Morrison joins the podcast to discuss what she calls the “pillars” of inclusive design and how her research is positively impacting people with health issues and disabilities. She also shares how having a child born with blindness put her in touch with a community of people she would otherwise never have met.
Inclusive design for all
Ed Cutrell, principal researcher in the Ability Group, talks about his work in the disability and inclusive design space, explains the vital importance of interdisciplinarity—a fancy way of saying many ways of thinking and many ways of knowing—and tells us how a dumb phone beat a smart tablet in rural India … and what that meant to researchers.
Story contributors: Mary Bellard, Neeltje Berger, Tetiana Bukhinska, David Celis Garcia, Matt Corwine, Kristina Dodge, Martin Grayson, Theo Holroyd, Alyssa Hughes, Daniela Massiceti, Matthew McGinley, Amanda Melfi, Cecily Morrison, Karolina Pakėnaitė, Joe Plummer, Brenda Potts, Carly Quill, Katie Recken, Patti Thibodeau, Amber Tingle, Sarah Wang, Larry West, and Katie Zoller. Find My Things usage photos by Jonathan Banks for Microsoft.