PHINIX: An Open Source, Visual Assistant for the Blind
Authors: Ben Weimer, Jagadish Mahendran, Breean Cox

Ximira's PHINIX (Perceptive Helper for Independent Navigation and Inclusive eXperiences) system is a collaborative effort among team members who are both blind and sighted, actively incorporating their shared experiences and perspectives. This inclusive approach ensures a user-centric design that caters to the needs and preferences of the very community it serves. With the support of Accenture and Intel, the team is dedicated to developing a system that empowers individuals with visual impairments, providing them with expanded choices and the ability to make decisions according to their own preferences. This wearable edge device integrates a voice user interface (VUI) and a haptic user interface (HUI) to create an immersive experience within three-dimensional spaces. The PHINIX system offers a range of features, including obstacle detection, pathway guidance, object description, optical character recognition, and facial recognition.

Embracing open-source principles, the PHINIX system encourages collaboration and customization, enabling developers and users to explore new possibilities and tailor the device to their specific requirements. It is an open-source software (OSS) and open-source hardware (OSHW) project, underscoring its commitment to community involvement. Recent advancements in wearable technology, such as increased computing power, miniaturized components, and advanced sensors, have paved the way for the PHINIX system. Leveraging artificial intelligence (AI), Ximira's vision-based assistive technology can be integrated with existing tools to create a versatile mobility solution for users with visual impairments. As an edge device, the PHINIX system performs real-time processing and analysis of captured visual information directly on the device, ensuring low latency.
Operating independently of an internet connection guarantees continuous support for users in areas with limited or no network connectivity. Edge computing eliminates the need for constant video data streaming, reducing data usage fees. Additionally, edge devices prioritize data privacy and security by processing and storing sensitive information locally on the device itself, enhancing user confidentiality.

"I am fueled by an unwavering desire for more than mere functionality, but for the ability to embrace spontaneity and playfulness in my daily adventures. This project represents our unwavering commitment to create a device that not only prioritizes safety but also unlocks the freedom to explore, create, and revel in the beauty of unexpected moments. Together, let us build a future where my blind peers have access to information on their terms." – Breean Cox, Cofounder
The project began as a collaborative effort between a computer vision engineer and an individual who is blind. Together, they developed a proof of concept that earned first place in Intel's OpenCV Spatial AI competition in 2020. Inspired by this success, the team expanded, welcoming more volunteers and establishing Ximira LLC as a platform to drive the project forward. The team's goal was to create a minimum viable product that could have a tangible impact on the autonomy of individuals with visual impairments. With the support of Intel and Accenture, the team envisions releasing the product to the community in 2024 as an open-source initiative. At the core of Ximira's work lies the advancement of PHINIX, a wearable edge device powered by AI. Equipped with a chest-mounted, AI-based depth-sensing camera, an Intel computing unit, and a purpose-built carrying bag, the PHINIX system converts visual information into real-time audio and tactile formats, all processed locally on the device.
The barriers faced by over three million visually impaired Americans in employment, independence, and community access are undeniable. These barriers stem from a world predominantly designed for sighted individuals, creating significant obstacles to equal participation. Despite previous attempts to tackle these challenges, adoption of mobility tools and accessible technology remains limited. This can be attributed to several factors, including a lack of user-centric design, affordability constraints, and compatibility issues. Additionally, reliance on proprietary software often restricts users from personalizing the technology to meet their specific accessibility needs. With the support of Intel and Accenture, Ximira is committed to overcoming these barriers and providing practical solutions. By combining cutting-edge technology with a user-centric approach, we aim to develop accessible and customizable tools that promote greater mobility and foster inclusion.
Open Peripheral Integration
Modern technology plays a significant role in developing assistive technology, mobility devices, and wearables for the Blind community. These advancements leverage computer vision, AI algorithms, and innovative hardware to enhance navigation, access to information, and overall independence. Glasses-mounted devices, such as OrCam, utilize computer vision and AI algorithms to provide audio feedback for text recognition, face recognition, and object identification. Smart canes, such as WeWalk, incorporate GPS and haptic feedback to detect obstacles and provide navigation assistance. Mobile applications, such as Seeing AI, provide audio-based navigation, object recognition, optical character recognition (OCR), and access to digital content. In the field of augmented reality (AR) and virtual reality (VR), headsets offer audio and haptic feedback to users. Collaboration between technology companies, research institutions, and the Blind community remains vital. By involving end users in the design and development process, these advancements can better address the specific needs of each user. The flexibility of our software allows for the integration of existing peripheral devices: developers can create ROS2 nodes or Docker containers that encapsulate the functionality of a peripheral technology and establish communication channels with the PHINIX system.
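As an illustrative sketch of this integration pattern (not part of the released PHINIX stack), a hypothetical smart-cane bridge could be packaged as its own container and join the PHINIX ROS2 graph through DDS discovery on a shared host network. Every image name, service name, topic, and environment value below is an assumption made for the example.

```yaml
# Hypothetical docker-compose sketch: a peripheral adapter container
# joining a PHINIX-style ROS2 (Foxy) graph. All names are illustrative.
services:
  phinix-core:
    image: ximira/phinix-core:latest         # assumed image name
    network_mode: host                       # host networking so DDS discovery works
    environment:
      - ROS_DOMAIN_ID=42                     # shared domain so nodes see each other
  cane-adapter:
    image: example/smart-cane-bridge:latest  # hypothetical peripheral bridge node
    network_mode: host
    environment:
      - ROS_DOMAIN_ID=42                     # must match phinix-core
    devices:
      - /dev/ttyUSB0:/dev/ttyUSB0            # example: cane connected over serial
```

Because both containers share the host network and the same `ROS_DOMAIN_ID`, the adapter's ROS2 topics would be visible to the core system without any extra routing configuration.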
PHINIX - Enhancing Accessibility
Compatibility and interoperability challenges can hinder the development of accessible technology. PHINIX embraces open-source software as a solution, enabling developers to adapt the software to different platforms and accessibility requirements. This approach ensures comprehensive solutions and aligns with accessibility principles. Limited employment opportunities contribute to higher rates of poverty among individuals who are Blind. Ensuring affordability of accessible technology is a key factor in reducing financial barriers. Ximira is actively working towards affordability by engaging volunteers, utilizing open-source software and hardware, and conducting cost analysis during hardware development. With a deep understanding of user needs, Ximira prioritizes user-centered design. As a company owned and operated by individuals who are visually impaired, Ximira has firsthand experience and insight into the challenges faced by its users.
The Technology of PHINIX
PHINIX is designed to meet several critical system characteristics: it is hands-free, mobile, adaptable, modular, multimodal, unobtrusive, compact, and cloud optional. The system comprises five core hardware components: the compute module, camera module, wireless haptic bracelets, wireless open-ear headset, and mobile phone/tablet application. The compute unit features an Intel 13th Gen Core (Raptor Lake) processor, Intel Iris Xe graphics, an Intel Movidius Myriad X chip, and Intel Wi-Fi 6E AX210 (vPro) with Bluetooth 5.3. The camera module features a stereo depth camera and a wide, high-resolution color camera, with IR dot projection, IR illumination, and an embedded Intel Movidius Myriad X chip. The compute and camera modules are connected through a USB cable. The software components include open-source platforms such as the Ubuntu 20.04 OS and the ROS2 Foxy robotics framework, packaged in a containerized Docker format to enable portability across various platforms. In addition to providing a robust communication protocol, ROS2 enables our system to perform the advanced robotic operations required for navigation. AI frameworks and tools such as PyTorch, TensorFlow, ONNX, OpenCV, Python, and DepthAI are heavily used for perception, computer vision, speech recognition, and text-to-speech operations. Intel's OpenVINO toolkit is used to optimize the AI models, enabling them to run locally on the edge device with high inference speed in real time.
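To make the packaging concrete, a container for this kind of stack could be built on the official ROS2 Foxy base image (itself built on Ubuntu 20.04) with the Python vision dependencies layered on top. This is a minimal sketch only: the workspace path, package name, and launch file below are assumptions, not PHINIX's actual build.

```dockerfile
# Hypothetical Dockerfile sketch for a PHINIX-style perception container.
# Package choices, paths, and the launch file are illustrative assumptions.

# Official ROS2 Foxy base image (built on Ubuntu 20.04 "focal")
FROM ros:foxy-ros-base-focal

# Python dependencies for perception: OpenCV for vision, the DepthAI SDK
# for the depth camera, and the OpenVINO runtime for optimized inference.
RUN apt-get update && apt-get install -y python3-pip \
    && pip3 install --no-cache-dir opencv-python-headless depthai openvino \
    && rm -rf /var/lib/apt/lists/*

# Assumed workspace layout; the package and launch file are hypothetical.
COPY ./phinix_ws /opt/phinix_ws
WORKDIR /opt/phinix_ws
CMD ["bash", "-lc", "source /opt/ros/foxy/setup.bash && ros2 launch phinix_bringup perception.launch.py"]
```

Pinning the base image to Foxy on focal keeps the ROS2 and Ubuntu versions in lockstep with the stack described above, so the same container can run on any host with Docker installed.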
Meeting The Needs of the Community
The PHINIX system perceives the environment accurately in 3D using depth and point cloud data to detect possible obstacles in the user's path. The system also allows users to recognize and identify objects in their surroundings, such as cars, bicycles, or animals. Another feature is text recognition and reading: by leveraging optical character recognition (OCR) technology, the system can capture and interpret text from various sources, including signs, menus, documents, and labels. Lastly, the system incorporates facial recognition capabilities to facilitate social engagement. The user interface offers three modalities: auditory/voice (headset), haptic/gestural (bracelets), and graphical user interface (GUI)/touch.
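As a toy illustration of the obstacle-detection idea (not Ximira's actual algorithm), a depth image can be split into left/center/right zones and the nearest valid reading in each zone compared against a warning threshold. The function name, zone split, and 1.5 m threshold below are all assumptions made for the sketch.

```python
# Toy sketch of depth-based obstacle zoning (illustrative only): split a
# depth map into left/center/right vertical zones and flag any zone whose
# nearest point is closer than a warning threshold.

def obstacle_zones(depth_map, threshold_m=1.5):
    """depth_map: list of rows, each a list of depths in meters
    (0.0 means "no reading"). Returns {zone: nearest_depth} for zones
    that need a warning."""
    width = len(depth_map[0])
    third = width // 3
    zones = {"left": (0, third),
             "center": (third, 2 * third),
             "right": (2 * third, width)}
    warnings = {}
    for name, (lo, hi) in zones.items():
        readings = [row[c] for row in depth_map for c in range(lo, hi)
                    if row[c] > 0.0]          # ignore invalid (zero) depths
        nearest = min(readings, default=float("inf"))
        if nearest < threshold_m:
            warnings[name] = nearest          # nearest obstacle in meters
    return warnings

# Example: a 3x6 depth map with a close obstacle on the right side.
demo = [
    [3.0, 3.1, 2.9, 3.0, 1.0, 0.9],
    [3.2, 3.0, 3.0, 2.8, 1.1, 0.0],
    [3.1, 2.9, 3.0, 2.9, 1.2, 1.0],
]
print(obstacle_zones(demo))  # {'right': 0.9}
```

In a real pipeline the per-zone warnings would feed the multimodal interface described above, for example by pulsing the haptic bracelet on the side where the nearest obstacle was found.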
Looking to the Future
Ximira is committed to driving innovation and collaboration with the open-source community to develop a range of exciting features for the future of the PHINIX system. These include advanced indoor navigation utilizing Visual Simultaneous Localization and Mapping (VSLAM) technology, personalized map creation with the ability to easily share maps among users, and environmental element recognition for seamless identification of doorways, elevators, and stairwells. Additionally, Ximira aims to enhance occupational task assistance by incorporating capabilities such as detailed document OCR, handwriting OCR, and currency recognition. The vision also encompasses intuitive gestural control to provide users with a natural, non-verbal interaction method, as well as an integrated digital voice assistant to offer responsive and intelligent conversational support. By fostering collaboration and leveraging the power of the open-source community, Ximira strives to transform these aspirations into tangible advancements, expanding the frontiers of accessible technology. We are also dedicated to developing an API-driven solution that seamlessly integrates the product with external systems. As we move forward, Ximira remains steadfast in its mission to leverage cutting-edge technology and community collaboration to refine and expand the capabilities of the PHINIX system.
References

Diverseability Magazine (2021). The Buying Power of People with Disabilities. Available at: https://diverseabilitymagazine.com/2021/12/make-holidays-accessible/
McDonnall, Michele C., & Sui, Zhen (2017). Employment and Unemployment Rates of People who are Blind or Visually Impaired: Estimates from Multiple Sources 1994 - 2017. Published by The National Research and Training Center on Blindness & Low Vision, Mississippi State University. Available at: https://www.blind.msstate.edu/sites/www.blind.msstate.edu/files/2020-04/McDonnall__Sui_%282019%29_Empl_and_Unempl_Rates.pdf
NFB (June 2019). Blindness Statistics. Available at: https://nfb.org/resources/blindness-statistics
Donovan, Rich (September 1, 2020). Report Summary: The Global Economics of Disability. Published by RETURN ON DISABILITY, Design Delight from Disability. Available at: http://rod-group.com/sites/default/files/Summary%20Report%20-%20The%20Global%20Economics%20of%20Disability%202020.pdf
Praveen, R., & Paily, R. (2013). Blind Navigation Assistance for Visually Impaired Based on Local Depth Hypothesis from a Single Image. International Conference on Design and Manufacturing, IConDM 2013.
Mahendran, Jagadish K., Barry, Daniel T., Nivedha, Anita K., & Bhandarkar, Suchendra M. (2021). Computer Vision-Based Assistance System for the Visually Impaired Using Mobile Edge Artificial Intelligence. In CVPRW, 2021. Available at: https://openaccess.thecvf.com/content/CVPR2021W/MAI/papers/Mahendran_Computer_Vision-Based_Assistance_System_for_the_Visually_Impaired_Using_Mobile_CVPRW_2021_paper.pdf
OrCam Technologies (2023). OrCam: Assistive Technology for People with Blindness and Visual Impairment. Available at: https://www.orcam.com/en-us/home
WeWALK (2020). WeWALK - Smart Cane for Visually Impaired. Available at: https://wewalk.io/en/
Microsoft Corporation (2023). Seeing AI: Talking Camera for the Blind and Low Vision. Available at: https://www.microsoft.com/en-us/ai/seeing-ai