An engineer tests a Speech-Enabled Augmented Reality (SEAR) system. A prototype SEAR interface (below) shows an observed area (top), camera view (bottom right), and the status of the speech recognition engine
Today, finding a malfunctioning piece of hardware—be it in an offshore oil platform, chemical plant, refinery or manufacturing facility—can be as complicated as navigating the mythical Labyrinth at Knossos. Typically, a problem first becomes evident with a "ping" as a light starts blinking on an impressive array of monitors in a control room. Someone is then dispatched to track down the offending hardware and identify the problem. But neither task is simple. The process can eat up a significant amount of time, and even after many cell phone calls, documentation must still be retrieved and an order for a new part filled out.
A new technology called Speech-Enabled Augmented Reality (SEAR), developed by Siemens Corporate Research (SCR) in Princeton, New Jersey, holds the potential to radically improve this entire process. Based on a system now being tested at SCR, here's how SEAR would work in an industrial installation: A maintenance engineer working in a remote section of a chemical processing facility receives a message on his PDA, or on a wearable or mobile computer, that a pump in section X is malfunctioning. The engineer's movements are tracked by combining inputs from infrared beacons located in each area of the facility with a three-degree-of-freedom inertial tracker in his Compaq iPAQ Pocket PC. About twice a second, each beacon transmits a unique ID, which is detected by the PDA's IR port. This causes a VRML (Virtual Reality Modeling Language) browser to load the corresponding scenery file and viewpoint and display a VRML model—a realistic-looking 3D image—of the area on the PDA's screen. The engineer thus sees where he is standing in relation to his surroundings, and an arrow indicates where he should go next.
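The beacon-driven tracking described above can be sketched in a few lines. Everything here—the beacon IDs, file names, coordinates and the `locate` function—is invented for illustration; the actual SEAR data model has not been published. The idea is simply that each received beacon ID selects an area's scenery file and default viewpoint, while the inertial tracker refines the heading between beacon fixes:

```python
# Hypothetical sketch: mapping a received infrared beacon ID to the
# VRML scene and viewpoint the browser should display. All IDs, file
# names and coordinates below are assumptions for illustration.

# Each beacon ID is registered against a scenery file and a default
# viewpoint (position in meters, heading in degrees) for that area.
BEACON_REGISTRY = {
    0x1A: {"scene": "sectionX_pumproom.wrl", "position": (12.0, 0.0, 4.5)},
    0x1B: {"scene": "sectionX_corridor.wrl", "position": (20.0, 0.0, 4.5)},
}

def locate(beacon_id, inertial_heading):
    """Combine the area fix from the beacon with the heading reported
    by the three-degree-of-freedom inertial tracker."""
    area = BEACON_REGISTRY.get(beacon_id)
    if area is None:
        return None  # unknown beacon: keep showing the last known scene
    return {
        "scene": area["scene"],            # which VRML file to load
        "position": area["position"],      # where the user is standing
        "heading": inertial_heading,       # refined between beacon fixes
    }

view = locate(0x1A, inertial_heading=75.0)
print(view["scene"])  # the scenery file the VRML browser loads next
```

In this sketch the beacon supplies only a coarse, per-area position; the next step in the article—visual markers—is what refines it.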
Since the PDA has access to the facility's geographically referenced and digitized database—the most important and complex part of installing SEAR—the engineer's target object is known to the system. As a result, the PDA can use written or voice commands to direct the engineer to that object. Once in the area of the object, visual markers provide more precise location information.
Each marker carries a unique combination of dots, making it look like two tic-tac-toe boards side by side. A camera in the PDA, held at the same angle as the engineer's head, orients the mobile device based on any markers in its vicinity. "This solution allows us to calculate the exact position and orientation of the user," explains SCR's Nassir Navab, Ph.D., who led the development of the location detection technology. "Even if the object in question does not have a marker, the system can still identify it because it knows the user's location and orientation and is constantly comparing them to a 3D model of the facility."
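One way to picture how a dot-pattern marker encodes a unique ID is to read each of its two tic-tac-toe-style grids as nine bits. The encoding below is purely an assumption for illustration—the article does not specify how SEAR's markers are actually read—but it shows why a pair of 3×3 grids is enough to label hundreds of thousands of distinct objects:

```python
# Hypothetical sketch of decoding a SEAR-style visual marker. Each of
# the two 3x3 dot grids is read row by row as 9 bits, giving an 18-bit
# marker ID (over 260,000 unique markers). The encoding is an invented
# assumption; the real marker format is not described in the article.

def decode_marker(left_grid, right_grid):
    """Read two 3x3 boolean dot grids row by row into one integer ID."""
    marker_id = 0
    for grid in (left_grid, right_grid):
        for row in grid:
            for dot in row:
                marker_id = (marker_id << 1) | (1 if dot else 0)
    return marker_id

left = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]   # bits 17..9
right = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]  # bits 8..0
print(decode_marker(left, right))  # -> 131073 (bits 17 and 0 set)
```

In the real system the camera would also use the marker's apparent size and perspective distortion to recover the user's position and orientation, not just the ID.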
Once the problem object is located, "the system starts downloading the associated information into the cache memory of the engineer's mobile computer. The information appears on his mobile automatically, including the kinds of questions he can ask," says Navab, adding, "If, for instance, he is standing in front of a pump, it knows which pump and displays questions about pressure, contents, temperature and maintenance history." The number of questions that can be asked is limited only by the underlying database.
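The prefetching step Navab describes can be sketched as follows. The object IDs, attributes and the `prefetch` helper are all invented for illustration—the point is only that the menu of askable questions falls directly out of whatever attributes the identified object carries in the database:

```python
# Hypothetical sketch: once an object is identified, its record is
# copied into the mobile computer's local cache, and the questions the
# engineer may ask are derived from the attributes stored for it.
# The schema and IDs below are assumptions for illustration.

OBJECT_DB = {
    "pump-P17": {
        "pressure": "4.2 bar",
        "contents": "coolant",
        "temperature": "61 C",
        "maintenance_history": ["2001-03-12 seal replaced"],
    }
}

cache = {}

def prefetch(object_id):
    """Pull the object's record into the cache and return the list of
    questions it can answer (one per stored attribute)."""
    record = OBJECT_DB.get(object_id, {})
    cache[object_id] = record
    return sorted(record)

print(prefetch("pump-P17"))
# -> ['contents', 'maintenance_history', 'pressure', 'temperature']
```

Caching the record up front is what lets the interface display the available questions immediately, before the engineer has asked anything.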
3D Voices: Once a question such as "What is your current pressure?" is asked, "SEAR uses Siemens' Very Smart Recognizer voice recognition engine to process the information. The resulting data passes through a wireless access point to a database server. The server searches its database for the appropriate information, which is transmitted to the PDA. The PDA then generates a 3D voice into the user's headset that sounds as if it is coming from the object in question," says SCR's Dr. Stuart Goose, who developed the patented voice control system. "The ability to ask questions verbally and to receive spoken responses is important in situations in which the user needs to have both hands available," explains Goose.
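The round trip Goose describes—recognized question, database lookup over the wireless link, spoken answer that seems to come from the object—can be sketched as below. The database contents, function names and the stereo panning law are all assumptions; the patented SEAR audio system is not described in detail in the article. A simple constant-power pan based on the bearing to the object gives the flavor of the "3D voice":

```python
import math

# Hypothetical sketch of the spoken-query round trip, with a simple
# stereo approximation of the "3D voice": the response is panned
# according to where the object sits relative to the user. All names
# and the panning law are invented for illustration.

DATABASE = {("pump", "P-17"): {"pressure": "4.2 bar"}}

def answer_query(object_type, object_id, attribute):
    """Stand-in for the database server reached via the wireless access
    point once the recognizer has turned speech into a query."""
    record = DATABASE.get((object_type, object_id), {})
    return record.get(attribute, "no data")

def stereo_gains(user_pos, user_heading_deg, object_pos):
    """Left/right channel gains so the voice seems to come from the
    object: a constant-power pan over the bearing to the object."""
    dx = object_pos[0] - user_pos[0]
    dy = object_pos[1] - user_pos[1]
    bearing = math.degrees(math.atan2(dx, dy)) - user_heading_deg
    # clamp the bearing to -90..+90 degrees, map onto a 0..90 pan angle
    pan = math.radians(max(-90.0, min(90.0, bearing)) / 2.0 + 45.0)
    return math.cos(pan), math.sin(pan)  # (left gain, right gain)

print(answer_query("pump", "P-17", "pressure"))  # -> 4.2 bar
left, right = stereo_gains((0, 0), 0.0, (5, 0))  # object due right:
# right channel dominates, so the answer sounds as if it comes
# from the pump on the engineer's right
```

A constant-power pan (gains of cos and sin of the same angle) keeps the perceived loudness steady as the apparent source moves, which is one common design choice for this kind of spatialized cue.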
Does SEAR technology require a major overhaul of a facility's information infrastructure? Not necessarily. Many facilities already operate on SAP software that's based on object inventory numbers and associated databases, and some power plants have 3D models that are connected to such a database. Furthermore, many factories already run on WinCC, a Siemens automation program that controls programmable logic controllers. "The thing is," says Navab, "that today this information is shown in an abstract version on monitors. But the point is that the information exists, and that in such facilities, every object is already in the database. So in order to be efficient, our system has to talk to the WinCC and SAP systems."
And once that "talking" gets started, a lot can change, particularly in terms of maintenance and repair scenarios. Not only is each and every object in a SEAR facility in the database, but each could also have its own Internet address. As a result, information regarding each part could be harvested and compared with information from identical or competing parts in similar facilities. Maintenance and repair data could be compared and expert systems developed based on the resulting data. "By the time SEAR becomes commercially available in four or five years," says Navab, "it may be possible for an engineer confronted with a difficult problem on the factory floor to interrogate the part's maintenance history, send a multimedia e-mail to the person who most recently serviced it, ask an expert system for an opinion and, based on the responses, click the part's Web address and order a replacement. It can all happen on the spot with zero lost time."
The applications for such a system are virtually limitless. They range from asking your PDA to direct you to the nearest flower shop on your way to a date, to complex and as yet unexplored scenarios in which multiple users, such as firefighters or security personnel, visualize each other's movements as they move toward a common target. "This is a generic technology," says Navab. "All we can say now is that it is very likely to make many processes much more efficient."
Arthur F. Pease
Since 1999, Germany's Federal Ministry for Education and Research (BMBF) has funded a €21-million program called ARVIKA (Augmented Reality for Development, Production and Servicing). The program is led by Siemens' Automation and Drives (A&D) Group. Eighteen industry partners and five research institutes are involved, working in areas such as visualization, tracking, portable systems development and intuitive man-machine interfaces. "ARVIKA focuses on augmented reality (AR) applications that will support development, production and servicing of complex technical systems," explains Project Director Wolfgang Friedrich. "We expect AR and its related technologies to significantly benefit manufacturing in terms of improving analysis and maintenance operations as well as overall efficiency."
Patented speech recognition software allows users to "talk" with objects in their immediate vicinity through a wireless connection to a geographical database
AR superimposes relevant and location-specific information from a digital database over part of a user's field of view. Using a head-mounted display (HMD) or data glasses outfitted with a camera whose output is wirelessly connected to a facility's database, a user may, for instance, verbally request such a system to tell him how much torque to apply to a given bolt, or what the current function, contents and pressure of a pipe or pump are. "AR can be an outstanding training and support tool for service personnel," adds Friedrich. "For instance, the comparison of real installation work with simulation results could lead to comprehensive service process optimization, which would both improve the quality of work planning and simplify and accelerate diagnostic and repair work on the factory floor."