Machine Vision – 3D Object Recognition
Machine Vision’s In-Depth Picture
Machines are getting better at recognizing 3D objects. Progress in this area is being driven by stereo cameras, light stripe projection systems, laser scanners—and the intelligent software that makes them possible.
Cameras and laser scanners generate a 3D image of their surroundings—here in Dr. Frank Forster’s laboratory (large photo) and during the placement of heavy containers (bottom)
Dubai, free port. A fully loaded container ship has barely docked, yet it’s time to hurry. Wasted time is expensive. Towering gantry cranes purr and clatter as they lower their hooks and then hoist shipping containers, some of which are over 12 m long, ashore. Other giants approach on rail tracks and proceed to stack the cargo in the container storage area.
The cranes go through their motions as always. But this time something is different. Experts from Siemens Automation and Drives (A&D) are on hand and their attention is focused on a crane that’s about to maneuver a container onto the bed of a huge truck. Superficially, the crane looks no different from the others. But the A&D pros know what’s concealed in the crane’s trolley, which is slowly lowering the cargo toward the truck. Laser beams from the crane have detected the truck, scanned it, and created its three-dimensional image. As a result, the crane can now compute the vehicle’s position with great precision.
Visible light signals guide the truck driver precisely to his parking position under the gantry crane. The laser system verifies this position and communicates it to the crane’s control system, which then manages the unloading of the cargo onto the truck with precision in the centimeter range. The truck positioning system, which was developed by Siemens and tested for the first time in Dubai, functions flawlessly. In other words, these containers, which weigh many tons, can now be loaded and hauled away faster and more safely than ever.
The rail-mounted gantry cranes that stack containers in the storage area up to five high, in rows of six, will also work faster in the future. Siemens researchers plan to upgrade them so they’ll work fully automatically. Here too, laser technology will be useful, and preliminary tests are already under way. To enable the crane to deposit its load correctly in a huge stack of containers, the control system scans the container landscape and stores it as a 3D image. Using this image, the system automatically computes the quickest path to each storage place and the height to which the container must be hoisted in order to avoid obstacles.
Alois Recktenwald, product manager at A&D MC Cranes, is convinced that such driverless gantry cranes with 3D object recognition will be safer and more productive than their predecessors. "Even now, the safety and efficiency of the test cranes are several percent better than today’s standard," says Recktenwald.
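The hoisting-height step described above can be illustrated with a minimal sketch. It assumes the scanned 3D image has already been reduced to a list of stack heights along one row of the storage area; the function name, the 0.5 m safety clearance and the 2.6 m container height are illustrative assumptions, not details of the Siemens system, and the path-planning part is left out.

```python
def required_hoist_height(stack_heights_m, start, target, clearance_m=0.5):
    """Lowest height (in meters) at which the underside of the load clears
    every stack between the start and target columns, inclusive.

    stack_heights_m: stack height per column, as derived from a 3D scan.
    clearance_m: assumed safety margin above the tallest obstacle.
    """
    n = len(stack_heights_m)
    if not (0 <= start < n and 0 <= target < n):
        raise IndexError("start and target must be valid column indices")
    lo, hi = sorted((start, target))
    # The tallest stack on the way determines how high the load must go.
    tallest = max(stack_heights_m[lo:hi + 1])
    return tallest + clearance_m


CONTAINER_HEIGHT_M = 2.6      # assumed height of one container
stacks = [0, 2, 5, 1, 3, 0]   # hypothetical number of containers per column
profile = [count * CONTAINER_HEIGHT_M for count in stacks]

print(required_hoist_height(profile, start=0, target=3))
```

Moving from column 0 to column 3 in this example means passing over the five-high stack in column 2, so that stack sets the required height.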
Automated Forklifts. In addition to their work on rail-mounted harbor cranes, experts from Siemens A&D are also planning to equip forklifts with laser scanners. The driverless vehicles will then be able to cruise automatically through warehouses or factory shops without conventional floor-mounted guide wires. "By the end of the year, such autonomous vehicles will be ready for large-scale production," predicts A&D project manager Walter Beichl as he starts up a prototype in a warehouse near Stuttgart. "First it has to learn its route," explains Beichl as he maneuvers the forklift among the storage shelves in the building. As he does so, the laser beam emitted by the navigation system on top of the forklift continues to scan the surrounding space from floor to ceiling, scan line after scan line. From these scanned lines, which represent local geography, the forklift’s image processing software generates and stores a 3D model of the route’s environment.
In a subsequent automatic trip, an internal computer compares this map with new, live images detected by the laser scanner. The forklift precisely follows the route for which it’s been trained, including proper speed and steering angles. To achieve such accurate localization, the navigation system uses fixed landmarks on the ceiling, such as supporting beams, whose images it has stored during the test drive.
The forklift’s efficiency will be further enhanced when it also learns to reliably detect pallets in the laser image—and how to pick them up automatically. "Unfortunately, the required algorithms aren’t quite there yet," says Beichl. "They’ve learned to recognize pallets, but haven’t quite learned how to precisely align their forks with the pallet axes. As a result, they can’t accurately calculate the angle of approach to the pallet. But they’ll manage this feat within a few months." When that happens, information regarding the position and destination of each pallet will be transmitted by a master computer via WLAN to the forklift.
3D object recognition based on the use of colored stripes, so-called structured light, is useful in applications ranging from 3D face recognition to measuring suspension systems and ensuring a perfect fit for hearing aids
Closely Watched Packages. Vision systems are also set to transform postal sorting. At the Cologne-Bonn airport, for instance, Siemens has integrated an entirely new type of 3D object recognition system into its Singulator package sorting system. There, Siemens Industrial Solutions and Services has installed four Singulators for UPS to ensure efficient operation of package conveyor belts. Each Singulator takes an initially unsorted flow of packages of various sizes, all lying helter-skelter, and arranges it in single file.
But to line up the packages, the system must recognize and record them. To this end, six video cameras are installed above the conveyor. The first pair are stereo cameras, which acquire a three-dimensional view of the packages. This initial image information is processed, in conjunction with signals from photodiodes integrated in the conveyor belt, to determine the size, location and orientation of the packages. While this processing is under way, mechanical sorting proceeds on the conveyor belt, which is about 1.5 m wide.
Narrower side-by-side conveyor belts and rollers accelerate or slow down the packages individually to cause them to rotate and line up in single file. Four additional cameras monitor the scene to ensure that everything runs smoothly.
As part of his dissertation on color-coded triangulation, Dr. Frank Forster, a researcher at Siemens Corporate Technology (CT), developed a very promising method of 3D acquisition that’s already used in several Siemens products. The basic principle is simple. A projector illuminates the object whose shape is to be detected with a pattern of parallel, colored light stripes that are subsequently deformed according to the geometry of the object’s surface.
A camera records the resulting pattern, and in a fraction of a second a computer program composes a 3D image based on the pattern’s deformation. The method has a number of advantages: Since all that’s required are standard video system components, it is inexpensive to implement. What’s more, it generates a 3D record from a single video image, which means that it can be used for anything from facial recognition to detection of imperfections in manufactured objects.
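The geometry behind this kind of triangulation can be sketched in a few lines. The example below is a simplified two-dimensional model, not Dr. Forster's algorithm: it assumes an ideal pinhole camera at the origin looking along the z axis, a projector offset sideways by a known baseline, and a stripe whose projection angle has already been identified (for instance from its color). All names and numbers are illustrative assumptions.

```python
import math


def depth_from_stripe(u, stripe_angle_rad, baseline_m):
    """Depth z (in meters) of the surface point where a camera ray meets a light stripe.

    u: normalized image coordinate of the stripe in the camera (x / z).
    stripe_angle_rad: angle of the projected light plane, measured from the
        z axis at the projector.
    baseline_m: sideways offset of the projector from the camera.

    Camera ray:   x = u * z
    Light plane:  x = baseline_m + z * tan(stripe_angle_rad)
    Setting both equal gives z = baseline_m / (u - tan(stripe_angle_rad)).
    """
    denom = u - math.tan(stripe_angle_rad)
    if abs(denom) < 1e-12:
        raise ValueError("camera ray is parallel to the light plane")
    return baseline_m / denom


# Hypothetical setup: projector 0.2 m to the side of the camera,
# surface point at x = 0.1 m, z = 1.0 m.
baseline = 0.2
angle = math.atan2(0.1 - baseline, 1.0)   # direction from projector to the point
u_observed = 0.1 / 1.0                    # where the camera sees the stripe

print(depth_from_stripe(u_observed, angle, baseline))
```

When the surface is nearer or farther, the stripe appears at a different image position u, and that shift is the deformation the software evaluates. Because each stripe can be identified in a single frame, one image is enough to recover depth along every stripe.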
Faces and Hearing Aids. Color-coded triangulation was initially used for facial recognition in access control systems (see Pictures of the Future, Spring 2003, Biometric Technology). The advantage of this method over other recognition methods, such as those based on color images, is obvious: 3D recognition is more reliable, because the exact shape of the face is difficult to imitate. In order to quickly take advantage of Siemens’ technological lead in this area, a cooperative agreement was concluded between CT, Siemens Building Technologies and Viisage Technology Inc., the global leader in personal identification systems.
Color-coded triangulation is also being put to good use by other Siemens Groups. Siemens Automation and Drives (A&D), for instance, uses it in the latest generation of chassis tuning systems. In this system, the surfaces of rotating vehicle wheels are scanned. Within seconds, the system precisely measures the chassis with a view to improving the quality of the vehicle. After all, the more precisely the chassis is adjusted, the greater its safety and driving comfort, and the less tire wear. The system is used by BMW and Porsche, among others, and will soon be available to repair shops as well.
Color coding is also used in the iScan system, which was recently introduced by Siemens Audiologische Technik GmbH (S.A.T.). iScan allows hearing-aid acousticians to produce digital casts of the auditory canal that can be translated into in-the-canal devices. iScan scans the canal and converts the image into 3D data. Then, instead of mailing a physical cast for a hearing aid to a device manufacturer for digitization, acousticians can send an electronic version by e-mail. That’s a lot faster, simpler and more secure.
Rolf Sterbak
As part of the Cognitive Aid System for Blind People (CASBliP) project, a research initiative supported by the European Union, Siemens is collaborating with universities and organizations for the blind in developing a sensor system that uses audio signals to endow sightless people with spatial perception of their surroundings. The concept for the sensor system, which is built into a pair of glasses, was developed by Siemens Corporate Technology (CT). "We originally developed the sensor for pedestrian recognition by cars," says project researcher Dr. Peter Mengel. "But this solution is also very well suited as an orientation aid for blind people." A laser diode in a special pair of glasses scans its surroundings with infrared light pulses up to five meters ahead, across an angle of 60°. Infrared light reflected by objects is detected by a tiny scanning camera with 64 pixels. Differences in elapsed time are converted into a distance profile of the immediate environment, which in turn is converted into audible signals. The shorter the distance to the object, the higher the pitch of the sound; conversely, the farther the object, the lower the pitch of the signal.
Thanks to the different angles from which the infrared light is reflected, there is an audible right / left difference. By turning his or her head, an otherwise unaided blind person can gain a nearly complete impression of object distances in the immediate environment.
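The conversion from elapsed time to distance, and from distance to pitch, can be illustrated as follows. This is a rough sketch rather than the CASBliP implementation: the round-trip formula is standard time-of-flight physics, while the linear mapping and the 200 Hz to 2,000 Hz tone range are assumptions chosen only for the example.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def distance_from_round_trip(elapsed_s):
    """Distance to the reflecting object, given the pulse's round-trip time."""
    # The pulse travels to the object and back, hence the division by two.
    return SPEED_OF_LIGHT_M_S * elapsed_s / 2.0


def pitch_for_distance(distance_m, max_range_m=5.0, low_hz=200.0, high_hz=2000.0):
    """Map a distance to a tone frequency: the nearer the object, the higher the pitch.

    max_range_m matches the five-meter range mentioned in the text;
    low_hz and high_hz are assumed values.
    """
    clamped = min(max(distance_m, 0.0), max_range_m)
    nearness = 1.0 - clamped / max_range_m   # 1.0 at the sensor, 0.0 at full range
    return low_hz + nearness * (high_hz - low_hz)


# A pulse that returns after roughly 13.3 nanoseconds has hit an object
# about two meters away.
d = distance_from_round_trip(13.3e-9)
print(round(d, 2), "m ->", round(pitch_for_distance(d)), "Hz")
```

Applying the same conversion to each of the 64 pixels would yield the distance profile described above, with the left and right halves of the profile driving the left and right audio channels.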