In September of 2007, Axion Racing engaged Terra Soft Solutions to assist with the introduction of a Sony PS3 running Yellow Dog Linux to replace one of the on-board Dell servers for realtime, stereoscopic vision. Terra Soft's Bill Mueller had just ten days to pull off a very hard task ... and he did. The following is Bill's account of his work, start to finish.
Architecture Overview
Two Logitech QuickCam Pro 5000 cameras are connected via USB to the PS3, which runs YDL 5.0.2. The PS3 is connected to Spirit's Dell server rack over 100 Mbit Ethernet. The PS3 captures the images, processes the data, and sends a message to Spirit indicating the presence, distance, and general direction of an obstacle.
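The account does not describe the message format sent to Spirit, so the following is only a sketch of what such a message could look like. The field layout, the names `OBSTACLE_FORMAT`, `encode_obstacle`, and `decode_obstacle`, and the choice of units are all assumptions made for illustration.

```python
import struct

# Hypothetical layout (not from the original system), network byte order:
# 1 byte presence flag, 4-byte float distance in feet, 4-byte float bearing in degrees.
OBSTACLE_FORMAT = "!Bff"


def encode_obstacle(present, distance_ft, bearing_deg):
    """Pack an obstacle report into a fixed-size 9-byte payload."""
    return struct.pack(OBSTACLE_FORMAT, 1 if present else 0, distance_ft, bearing_deg)


def decode_obstacle(payload):
    """Unpack a payload produced by encode_obstacle."""
    flag, distance_ft, bearing_deg = struct.unpack(OBSTACLE_FORMAT, payload)
    return bool(flag), distance_ft, bearing_deg


if __name__ == "__main__":
    payload = encode_obstacle(True, 12.5, -10.0)
    print(len(payload))              # 9
    print(decode_obstacle(payload))  # (True, 12.5, -10.0)
```

A small fixed-size payload like this would fit in a single datagram on the Ethernet link, but the transport actually used between the PS3 and Spirit is not stated in the account.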
Stereo Vision Concept
The two cameras are placed side by side. Pictures from the right and left cameras are taken simultaneously. This data is fed into an algorithm that detects the apparent shift of objects between the two pictures. The algorithm outputs a disparity map. The greater the shift, the higher the value in the disparity map. How is this useful?
An object is seen by both cameras. But because the cameras are separated, they see slightly different views of the same object. As an object comes closer, the left and right cameras' views differ greatly. As an object moves into the distance, both cameras see a greater portion of its front, and the two views become more alike.
You can test this yourself by placing your hand 6 inches from your face. Alternate closing your left and right eye. Notice your hand appears to shift right and left. Now move your hand out far away and repeat. It doesn't shift as much now, right? Same idea here.
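To make the idea of a disparity map concrete, here is a minimal sketch that computes disparity along a single image row by block matching with a sum of absolute differences. This is a deliberately simple stand-in, not the Birchfield algorithm used in the project; the function name `scanline_disparity` and the `window` and `max_shift` parameters are assumptions for illustration.

```python
def scanline_disparity(left, right, window=1, max_shift=4):
    """Return one disparity value per pixel for a single row.

    For each pixel in the left row, try shifts 0..max_shift and keep the
    shift whose neighbourhood in the right row matches best (lowest sum of
    absolute differences). Ties keep the smallest shift.
    """
    n = len(left)
    disparities = []
    for x in range(n):
        best_shift, best_cost = 0, float("inf")
        for d in range(max_shift + 1):
            cost = 0
            for k in range(-window, window + 1):
                xl = min(max(x + k, 0), n - 1)      # clamp at the row borders
                xr = min(max(x + k - d, 0), n - 1)  # feature appears d pixels further left in the right image
                cost += abs(left[xl] - right[xr])
            if cost < best_cost:
                best_cost, best_shift = cost, d
        disparities.append(best_shift)
    return disparities


if __name__ == "__main__":
    # A bright feature at columns 6-7 in the left row shows up at columns 4-5 in the right row.
    left_row = [0, 0, 0, 0, 0, 0, 9, 5, 0, 0, 0, 0]
    right_row = [0, 0, 0, 0, 9, 5, 0, 0, 0, 0, 0, 0]
    disp = scanline_disparity(left_row, right_row)
    print(disp[6], disp[7])  # 2 2
```

Flat, textureless regions match equally well at every shift, which is one reason real disparity maps are noisy and need filtering.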
Software
The Logitech QuickCam Pro 5000 cameras were installed using the uvc V4L2 camera drivers. The images were captured using fswebcam. The disparity map generation relied heavily on the previous work of Stan Birchfield. The image manipulation and conversions came out of the latest release of Netpbm. Many other utilities were tried, and will be the topic of howtos in the future.
Initial Lab Tests
0.450 s to open a connection to the cameras
< 0.1 s to process the images/disparity maps (PPU only)
The algorithm is highly vectorizable with little effort as it is one big matrix calculation. The camera is by far the weak link. We can add another 2 cameras to cut it in half, or try bursting data from the camera. But I chose to do single open, read, close transactions for stability. Also, removing the extra disk I/O at the conversion stages would speed it up as well.
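As a rough worked example of what these timings imply, the sketch below estimates the frame rate from the two measured stages. It assumes the stages run back to back and that a second camera pair would halve the effective capture time per frame; both are simplifying assumptions, and `frames_per_second` is a hypothetical helper, not part of the project code.

```python
def frames_per_second(capture_s, process_s, camera_pairs=1):
    """Estimate frames per second for a sequential capture-then-process loop.

    Assumes capture time per frame divides evenly across alternating
    camera pairs, which is an idealisation.
    """
    effective_capture_s = capture_s / camera_pairs
    return 1.0 / (effective_capture_s + process_s)


if __name__ == "__main__":
    # 0.450 s to open the cameras, at most 0.1 s to process (so these are lower bounds).
    print(round(frames_per_second(0.450, 0.1), 2))                  # 1.82
    print(round(frames_per_second(0.450, 0.1, camera_pairs=2), 2))  # 3.08
```

Because processing is already the smaller term, the numbers support the point that the camera, not the PPU, limits throughput.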
The First Road Test
Adding Object Direction Detection
Once it was apparent the system was doing object detection, the next step was to get a direction and a distance measurement off the captured images. This was done by calculating the amount of disparity across the image and filtering out the ambient noise. I tried stretching the source image from the left camera across the bottom for reference.
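The account does not give the exact formulas used, but the standard pinhole stereo relationship is a reasonable sketch of how disparity turns into distance and direction: distance is focal length times camera separation divided by disparity, and direction follows from how far the object's column sits from the image centre. The focal length and baseline values below are made-up illustration values, not the project's calibration.

```python
import math


def distance_from_disparity(disparity_px, focal_px, baseline_ft):
    """Distance in feet from disparity in pixels (pinhole stereo model)."""
    if disparity_px <= 0:
        return math.inf  # no measurable shift: object is effectively at infinity
    return focal_px * baseline_ft / disparity_px


def bearing_degrees(column, image_width, focal_px):
    """Horizontal angle of an image column; negative is left of centre."""
    centre = (image_width - 1) / 2.0
    return math.degrees(math.atan2(column - centre, focal_px))


if __name__ == "__main__":
    # Hypothetical calibration: 500 px focal length, cameras 0.5 ft apart.
    print(distance_from_disparity(10, 500.0, 0.5))  # 25.0
    print(bearing_degrees(320, 641, 500.0))         # 0.0
```

Since distance varies with the inverse of disparity, a one-pixel error matters far more for distant objects than for near ones.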
The yellow map shows calculated distance in feet. The red area displays how good the data is (higher value == less noise detected). The blue area displays actual object detection. Values above zero mean no object was detected. Values below zero mean a collision is imminent.
As you can see, the backs of the chairs and my head are detected fairly well (as indicated by note 1). Note 2 shows where there was too much noise and we got no data. This is an image of completely raw data. Subsequent tests included a filter to smooth it out a little.
One point to note here: high-contrast background noise fakes out the algorithm a bit. To the far right, you can see a lot of bouncy data from where the edge of the closet is detected. As noted previously, this can be reduced with proper calibration of the cameras but not completely eliminated.
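The account does not say which smoothing filter was used. As one plausible example, a sliding median filter removes isolated spikes from a row of per-column readings while keeping real edges fairly sharp. The function name `median_smooth` and the `radius` parameter are assumptions for this sketch.

```python
from statistics import median


def median_smooth(values, radius=1):
    """Replace each value with the median of its neighbourhood.

    The window shrinks at the ends of the list instead of padding.
    """
    n = len(values)
    smoothed = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed


if __name__ == "__main__":
    # A single noisy spike in otherwise steady readings is removed.
    raw = [5, 5, 40, 5, 5]
    print(median_smooth(raw) == [5, 5, 5, 5, 5])  # True
```

A larger `radius` smooths more aggressively but also delays the response to a genuine obstacle edge.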
Camera Mounts
San Diego Road Test