Eye-paratus

Building Spatial Augmented Reality for Explainable Robot Vision


2023
For: Harvard GSD Mediums Proseminar
Role:  Hardware design, software integration
Special Thanks: Matte Lim (Harvard)






Concept

In recent history, the surge of large digital image datasets has fueled the advancement of deep-learning algorithms in computer vision. The field has thus far been fixated on teaching computers to process, analyze, and understand digital images, extracting usable information from the real world through image processing. However, the social and cultural dimensions of the machine's ability to see the physical world remain largely unexplored.




Eye-paratus addresses the disconnect between human and machine perception, focusing on the challenge of intuitively “seeing” what a machine sees through its camera and algorithms. It explores the potential of projection technology as a new way to navigate this socio-technical landscape.

A spatial merging of perceptions between human and machine.





Eye-paratus explores the interaction between human and machine perception by combining light projection and algorithmic processing to externalize machine vision into the physical world.






a) What the Machine Sees
(OpenCV Object Detection)


b) Externalizing What the Machine Sees
(Projection Mapping onto Real Objects)





Mechanical Systems


a) three stepper motors controlling rotation, tilt, and focus
b) three gear-based movements
c) one LiDAR sensor and one depth camera
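
As a rough illustration of how the gear-based axes could be driven, the sketch below converts a target axis angle into motor steps; the step resolution, microstepping factor, and gear ratios are assumed values for the sketch, not measurements from the build.

```python
# A minimal sketch (not the project's actual firmware interface) of mapping
# a target axis angle to stepper steps through a gear train. Assumes
# NEMA-17-class motors (200 full steps/rev) with 16x microstepping and
# illustrative gear reductions per axis.

STEPS_PER_REV = 200 * 16          # full steps per rev x microstepping (assumed)
GEAR_RATIO = {"rotation": 5.0,    # motor revolutions per axis revolution (assumed)
              "tilt": 3.0,
              "focus": 1.0}

def angle_to_steps(axis: str, degrees: float) -> int:
    """Convert a target axis angle into motor steps through the gear train."""
    motor_degrees = degrees * GEAR_RATIO[axis]
    return round(motor_degrees / 360.0 * STEPS_PER_REV)

print(angle_to_steps("rotation", 90))  # 90° of pan -> 4000 motor steps here
```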






Integration Pipeline




LiDAR, Depth Camera & Motor Control

a) Depth camera → focus motor → projector auto-focus
b) LiDAR tracking the nearest object → real-time updated target rotation
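
A minimal sketch of the tracking side of this loop, assuming the LiDAR returns (angle, distance) pairs; the `lidar.read()` and `rotation_motor.move_to()` driver calls and the smoothing gain are hypothetical, illustrative stand-ins.

```python
def nearest_object_bearing(scan):
    """Given a LiDAR scan as (angle_deg, distance_m) pairs, return the
    bearing of the closest return, ignoring dropouts (distance == 0)."""
    valid = [(angle, dist) for angle, dist in scan if dist > 0]
    angle, _ = min(valid, key=lambda pair: pair[1])
    return angle

def step_toward(current_deg, target_deg, gain=0.2):
    """Move a fraction of the way toward the target bearing each frame,
    wrapping the error to [-180, 180) so the axis takes the short way round."""
    error = (target_deg - current_deg + 180) % 360 - 180
    return current_deg + gain * error

# Per frame (hypothetical driver calls):
# scan = lidar.read()
# heading = step_toward(heading, nearest_object_bearing(scan))
# rotation_motor.move_to(heading)
```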



Human-Projector Interactions

A series of experiments explored the interactions one might have with the projector. How does the machine see? Where does it see? How does it communicate what it sees to us in an intuitive manner?




Object Detection + Projection Mapping (OpenCV)

Eye-paratus operates on MediaPipe’s object detection model to recognize 80 distinct object classes. By analyzing camera input frame by frame, it defines what it “sees”: only information on these “seen” objects is saved, and the rest is thrown out. Conversely, what it fails to detect, it doesn’t “see”.
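
A minimal sketch of this per-frame loop using MediaPipe's object detection task; the EfficientDet-Lite0 checkpoint (covering COCO's 80 classes) and the 0.5 score threshold are assumptions, not the project's exact configuration.

```python
import cv2
import mediapipe as mp
from mediapipe.tasks import python as mp_python
from mediapipe.tasks.python import vision

# An off-the-shelf detector over COCO's 80 object classes; the model file
# path and score threshold here are assumed, not the project's own values.
detector = vision.ObjectDetector.create_from_options(
    vision.ObjectDetectorOptions(
        base_options=mp_python.BaseOptions(model_asset_path="efficientdet_lite0.tflite"),
        score_threshold=0.5,
    )
)

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    result = detector.detect(mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb))
    # Keep only what cleared the threshold -- the machine's record of what
    # it "saw"; everything else in the frame is discarded.
    seen = [(d.categories[0].category_name, d.bounding_box)
            for d in result.detections]
cap.release()
```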


1. Projecting a red bounding box to highlight detected objects
2. Removing the white background to create a more immersive object highlight


As the machine projects its perception, the illuminated frames signify visibility, while the parts of the scene its projector leaves unlit, especially in the absence of other artificial light, become enveloped in “artificial darkness”. This darkness becomes a site of invisibility, a space where objects and elements outside the machine’s programmed recognition remain unseen and unacknowledged.
3. Replacing the red box with white outlines to further enhance the visual overlay
4. Adding a green dot to represent the machine’s “target of sight”
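
The overlay itself can be composed as an all-black canvas that lights only the detections, roughly as sketched below (reusing `result.detections` from the loop above); a real setup must additionally warp this canvas to align camera and projector coordinates, e.g. with a calibrated homography, which is omitted here.

```python
import numpy as np
import cv2

def render_overlay(frame_shape, detections, target_idx=0):
    """Compose the projector frame: black everywhere ("artificial darkness"),
    white outlines around each detection, and a green dot marking the
    machine's current target of sight."""
    h, w = frame_shape[:2]
    canvas = np.zeros((h, w, 3), dtype=np.uint8)    # unlit = unseen
    for i, det in enumerate(detections):
        b = det.bounding_box                        # MediaPipe pixel box
        cv2.rectangle(canvas, (b.origin_x, b.origin_y),
                      (b.origin_x + b.width, b.origin_y + b.height),
                      (255, 255, 255), 2)
        if i == target_idx:                         # current target of sight
            center = (b.origin_x + b.width // 2, b.origin_y + b.height // 2)
            cv2.circle(canvas, center, 8, (0, 255, 0), -1)
    return canvas

# cv2.imshow("projector", render_overlay(frame.shape, result.detections))
```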


Kevin Tang © 2019 - 2024. All Rights Reserved