AudioVision: A Stereophonic Analogue to Visual Systems

Abstract: AudioVision is designed to take a visual representation of the world–inthe form form of one or more video feeds–and convert it into a related stereophonic audio representation. With such a representation, it should be possible for someone who has minimal or no use of their visual system to avoid obstacles using their sense of hearing rather than vision. To this end, several different vision algorithms, including single and multiple image disparity, disparity from motion, and optical flow were investigated.

read more...


AudioVision Update

The quarter is ending and so is my current work on AudioVision. I have successfully managed to convert a basic two camera view into stereophonic 3d audio, using OpenCV (C++). I hope to continue this work some time in the future, so keep an eye out here for any future developments.


AudioVision Update

Since deciding that I cannot use MATLAB because of the additional addons necessary to use webcams, I have been deciding between C# and Python as the next language to try. I’ve settled on Python for the time being, using VideoCapture to connect to the webcams and Numpy to process the data. It turns out that Python + VideoCapture + Numpy is actually rather similar in functionality and syntax to MATLAB with its image processing library.

read more...


AudioVision Update

The original plan to use Make3D for the visual depth determination has mostly fallen through, partially because it has several dependencies that I cannot get to build correctly and partially because it is written in a combination of C and MATLAB. I have nothing against either of these languages; however, I do not have the addons necessary for MATLAB to connect to a webcam. As such, I’ve decided to switch from a monocular vision algorithm to a more traditional stereo vision algorithm.

read more...


AudioVision Overview

I am taking an independent study course this winter in Image Recognition / Computer Vision. The primary goal of my independent study is to look into determining depth information from video feed(s) in real time and then representing that depth information using a 3D audio map (headphones).

read more...