Abstract: AudioVision is designed to take a visual representation of the world–in the form of one or more video feeds–and convert it into a related stereophonic audio representation. With such a representation, it should be possible for someone who has minimal or no use of their visual system to avoid obstacles using their sense of hearing rather than vision. To this end, several different vision algorithms were investigated, including single and multiple image disparity, disparity from motion, and optical flow. In addition, two different methods of mapping the resulting disparity map to stereophonic audio–maximal points and sonar scan–were implemented. The results are rather promising: the combination of Lucas-Kanade optical flow and sonar scan audio fulfilled the aforementioned goals in simple tests.
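To give a rough sense of how the winning combination fits together, here is a minimal sketch of a Lucas-Kanade-plus-sonar-scan step using OpenCV and NumPy. This is my own illustrative reconstruction, not the code from the paper: the column count, tone frequency, and magnitude normalization are all placeholder choices, and flow magnitude is used as a crude stand-in for proximity.

```python
# Sketch only: sparse Lucas-Kanade flow mapped to a left-to-right "sonar scan"
# of stereo tones. Parameters below are illustrative assumptions, not the
# values used in the AudioVision paper.
import numpy as np
import cv2

SAMPLE_RATE = 44100
N_COLUMNS = 16        # horizontal bins swept left to right (assumed)
TONE_SECONDS = 0.05   # duration of each tone in the sweep (assumed)
TONE_HZ = 880.0       # pitch of the scan tone (assumed)

def flow_points(prev_gray, gray):
    """Track sparse features with Lucas-Kanade; return (x, flow magnitude) pairs."""
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                 qualityLevel=0.01, minDistance=7)
    if p0 is None:
        return np.empty((0, 2))
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None)
    good = status.ravel() == 1
    old, new = p0[good].reshape(-1, 2), p1[good].reshape(-1, 2)
    mags = np.linalg.norm(new - old, axis=1)  # larger flow ~ closer object
    return np.column_stack([old[:, 0], mags])

def sonar_scan(points, frame_width):
    """Sweep columns left to right, emitting one stereo tone per column.

    Loudness encodes flow magnitude (proximity proxy); pan encodes the
    column's horizontal position. Returns stereo samples, shape (n, 2).
    """
    edges = np.linspace(0, frame_width, N_COLUMNS + 1)
    t = np.linspace(0, TONE_SECONDS, int(SAMPLE_RATE * TONE_SECONDS),
                    endpoint=False)
    chunks = []
    for i in range(N_COLUMNS):
        in_bin = (points[:, 0] >= edges[i]) & (points[:, 0] < edges[i + 1])
        mag = points[in_bin, 1].mean() if in_bin.any() else 0.0
        amp = min(mag / 20.0, 1.0)        # crude normalization (assumed scale)
        tone = amp * np.sin(2 * np.pi * TONE_HZ * t)
        pan = i / (N_COLUMNS - 1)         # 0 = hard left, 1 = hard right
        theta = pan * np.pi / 2           # constant-power pan law
        chunks.append(np.column_stack([tone * np.cos(theta),
                                       tone * np.sin(theta)]))
    return np.concatenate(chunks)
```

Each video frame would feed one sweep: the ear hears a tone marching from left to right, getting louder wherever the flow field (and so, presumably, a nearby obstacle) is strongest.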
If you’d like to read the full paper, you can do so here:
There’s also an early set of slides on the same topic from my time at Rose-Hulman: