Auto Electronics – Vehicle Communications
My Car Understands Me
Automobiles are becoming increasingly intelligent. In tomorrow’s cars, engine performance will become less important as their ability to process and relay information grows. For example, a virtual passenger will talk to the driver, warn of impending traffic danger and even hunt down music on the Internet. To this end, Siemens is merging information and communications to generate infotainment.
Tomorrow’s vehicles will be online all the time. Navigation and entertainment systems will be fully networked and operated by voice commands
When drivers climb into their cars in the future, they will probably be met by a virtual passenger who inhabits the cockpit display. This avatar—a species of digital aide or companion—might first ask where the driver wants to go before reminding him or her of appointments and asking if the house alarm has been activated. "Of course, this kind of scenario goes way beyond what we can do today with our onboard infotainment system," admits Dr. Hans-Gerd Krekels, head of Product and Innovation Management for Infotainment Solutions at Siemens VDO Automotive in Wetzlar, Germany. "At present, cars feature a range of individual devices, such as radio, CD player, DVD changer, MP3 player and navigation system. The driver therefore needs to know which content can be accessed on what device," he explains. According to Krekels, however, this type of multi-device setup is destined to be superseded by content-based multimedia systems. Users will thus no longer have to look for specific content. Instead, a central search function will locate it for them, either in the trunk of the car, where the DVD and CD changers are installed, or on the Internet via a wireless network (WLAN), or via mobile networks such as GSM or UMTS.
"In the future, we’ll have vehicles filling up with not only gasoline but also new data," continues Dr. Abdelkarim Belhoula, who works in preliminary development and is responsible for telematics and driving-assistance systems (see "The Road to Telematic Travel"). Stop at the gas station, and your car will automatically log on to the local server via a WLAN or similar system. And if you have a loyalty card, you may well be eligible for privileges such as a free music download. Likewise, radio hotspots located on the outskirts of a city or town might transmit the traffic news or "what’s on" information to the onboard infotainment systems of passing vehicles. What’s more, the same onboard technology could also be used to call home—provided, of course, that you’re the owner of a "smart home," like the T-Com House in Berlin (see "T-Com House"). In that case, your home will be able to contact your vehicle via mobile radio and transmit information on the status of various household systems, including heating, lighting and alarms.
Other web-based functions are also an option, including the so-called buddy-tracking system unveiled by Siemens at the beginning of the year. For instance, if you’ve somehow been separated from friends while on vacation, all you’ll need to do is call the buddy tracker, and a map display on your navigation system will show you their locations. The only requirement is a mobile radio connection that’s always online, via UMTS for example. Such a system should be ready for market launch in around five years.
Slippery When Wet. Belhoula’s vision of communicative vehicles also includes cars capable of exchanging the kind of information that will improve road safety. Earlier this year, at the closing presentation for the Invent project, which was sponsored by the German Federal Ministry of Education and Research, Siemens demonstrated how accidents can be prevented when vehicles are able to communicate safety-relevant data via radio. On a test track in Munich, a car was made to skid on a wet road. As it did so, it transmitted a warning, including its exact location, via GSM to the—in this case imaginary—vehicle behind it. As a result, the driver of that car would have had more time to brake or swerve. Should vehicles be permanently connected to the Internet, such communication could also take place directly, possibly via WLAN. However, WLAN technology was not originally developed for mobile applications and still encounters problems when it comes to moving vehicles. "We’re striving to create a mobile WLAN technology that can reliably transfer data between vehicles travelling at high speed," says Belhoula.
Future scenario. Tomorrow’s vehicles will network with one another and exchange information on traffic conditions and potential hazards
It’s also crucial to ensure that all this vehicle-to-vehicle communication doesn’t overburden or distract drivers. Designers and ergonomics experts at Siemens are therefore examining how drivers might best handle various information and display systems. In this connection, the consensus is that voice-activated controls will play an increasingly important role in tomorrow’s vehicles. In fact, some of today’s cars already feature onboard systems, such as air conditioning and hi-fi, that can be operated by simple voice commands, including the selection of a radio station or a favorite CD. Yet tomorrow’s voice-activated technology will be so advanced that it will even be able to deal with vague or incomplete requests, such as "Play the next song from Robbie Williams!"
In other words, we already have a clear road map to the virtual passenger of the future. At Siemens Corporate Technology (CT) in Munich, experts such as Gerhard Hoffmann and his team are busy working on the next generation of voice-recognition technology, a solution known as the Very Smart Recognizer (VSR). Lab-based computer systems of 20 years ago were already able to handle around 5,000 spoken words, although this required a large host computer. For onboard applications, the dictates of space, cost and durability initially meant the use of mini computers. Two or three years ago, these were capable of recognizing around 500 spoken words.
Since then, however, there has been substantial progress. "Although the processing power of an onboard computer is still only around 10 percent that of a modern PC, our use of enhanced technology means that this type of system is now capable of understanding up to 75,000 words," says Hoffmann, referring to the team’s current lab model. A mass-produced derivative of this, featuring a capacity of 30,000 words, should be ready for market launch by the end of 2006, although ultimately VSR technology will be capable of understanding 100,000 words or more. By the same token, the capacity for recognizing fluent speech is set to rise from today’s figure of 2,000 words to over 50,000, and that goes for a range of languages.
Keeping in touch. Wireless onboard systems will be able to communicate with household systems
Read My Lips. But today’s voice-recognition technology will have to become much more robust if it is to operate in a noisy environment such as that of a vehicle. In quiet conditions, Hoffmann’s VSR can recognize practically all the words in its vocabulary. Dealing with the automotive sound environment, which includes engine noise, tire and wind noise, is, however, a completely different proposition. If a window is open, even people can have problems understanding one another. Yet the VSR is already highly immune to background noise and performs better in this category than rival systems. That’s because it features smart noise-suppression techniques and an echo suppressor that filters out known sources of sound such as signals from the audio system. As a result, the navigation system can be operated even when the radio is on.
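The article does not say how the VSR’s echo suppressor works internally. One widely used way to remove a known signal, such as the radio feed, from a microphone signal is a normalized least-mean-squares (NLMS) adaptive filter, and the Python sketch below illustrates that general idea only. The function name nlms_cancel, the number of filter taps and the step size are assumptions chosen for illustration, not details of the Siemens system.

```python
def nlms_cancel(reference, mic, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of `reference` from `mic`.

    reference: samples sent to the loudspeakers (the known source).
    mic:       samples picked up by the microphone (speech plus echo).
    Returns (cleaned_samples, final_filter_weights).
    """
    weights = [0.0] * taps   # current estimate of the loudspeaker-to-mic path
    recent = [0.0] * taps    # latest reference samples, newest first
    cleaned = []
    for x, d in zip(reference, mic):
        recent = [x] + recent[:-1]
        echo_estimate = sum(w * r for w, r in zip(weights, recent))
        error = d - echo_estimate          # what remains after removing the echo
        power = eps + sum(r * r for r in recent)
        # NLMS update: nudge the weights in the direction that reduces the error,
        # scaled by the recent power of the reference signal.
        weights = [w + mu * error * r / power for w, r in zip(weights, recent)]
        cleaned.append(error)
    return cleaned, weights
```

When only the radio is playing, the residual returned in the first element shrinks toward zero as the filter learns the echo path; when the driver speaks at the same time, the speech is what remains. Production systems also have to cope with much longer echo paths and with speech disturbing the adaptation, which this sketch ignores.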
In their efforts to boost recognition rates, experts at Siemens Corporate Research (SCR) in Princeton, New Jersey, are using a range of advanced techniques to blank out background noise. For example, their use of so-called array microphones has substantially increased word recognition. Array microphones comprise two, four or eight small microphones installed, for example, in the rearview mirror. Whenever the driver speaks, the sound signal takes varying lengths of time to reach the individual microphones. On the basis of these different propagation times, the array microphone is able to locate the sound source (in other words, the driver’s mouth) and suppress all other signals.
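The propagation-time principle can be illustrated with a short Python sketch. It is not the SCR implementation, which the article does not describe. It assumes just two microphones, a distant sound source and a plain cross-correlation search for the delay, and the names best_lag, arrival_angle and delay_and_sum are invented for this example.

```python
import math

SPEED_OF_SOUND = 343.0  # meters per second; assumed value for air at about 20 °C

def best_lag(ref, other, max_lag):
    """Return the lag in samples at which `other` best matches `ref`.

    A positive result means the sound reached `other` later than `ref`.
    """
    n = len(ref)
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation of the two channels at this lag.
        score = sum(ref[i] * other[i + lag] for i in range(n) if 0 <= i + lag < n)
        if score > best_score:
            best, best_score = lag, score
    return best

def arrival_angle(lag, sample_rate, mic_spacing):
    """Angle of the source in degrees, measured from straight ahead of the pair."""
    path_difference = (lag / sample_rate) * SPEED_OF_SOUND
    ratio = max(-1.0, min(1.0, path_difference / mic_spacing))
    return math.degrees(math.asin(ratio))

def delay_and_sum(ref, other, lag):
    """Shift `other` by `lag` samples so it lines up with `ref`, then average.

    Sound from the located source adds up in phase; sound from other
    directions is misaligned and is attenuated. Where the shifted channel
    has no sample, the reference sample is passed through unchanged.
    """
    n = len(ref)
    return [0.5 * (ref[i] + other[i + lag]) if 0 <= i + lag < n else ref[i]
            for i in range(n)]
```

With more microphones the same steps are repeated for each pair, which narrows the direction estimate and strengthens the suppression of off-axis noise.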
In Munich, CT researchers are working on a new system that employs lip-reading technology. This involves the use of a small video camera that monitors the driver’s face. Using, among other things, a special color-classification technique to scan the video images for different skin tones, the system localizes the driver’s lips and then deciphers the speech content on the basis of their movement. "It’s still early days, but we’re progressing well," says Hoffmann. "This technology will further enhance the VSR’s performance in very noisy environments such as convertibles."
Voice-activated modules in tomorrow’s vehicles will not only need to be able to interpret commands, they’ll also have to have speech capability of their own. This is already the case with today’s navigation systems, which give drivers directions. In addition, the onboard infotainment systems of the future will also be able to read text messages, e-mail and text from the Internet so that drivers can access information while concentrating on the road.
Papageno, for example, is a highly compact voice synthesizer currently being developed by Siemens researchers. The software first transcribes written words into phonemes and then determines the type of sentence to be spoken—for example, a question or short statement. In the process, it identifies the correct speech rhythm (prosody) to be adopted. On the basis of this analysis, it searches out the right speech fragments from a sound catalog, modifies them and combines them into intelligible speech.
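The stages described above can be sketched in a few lines of Python. This is a toy illustration, not Papageno itself: the tiny pronunciation lexicon, the pitch factors and all function names are assumptions made for the example, and the result is a plan of speech units with pitch adjustments rather than actual audio.

```python
# Invented mini-lexicon mapping written words to phoneme symbols.
LEXICON = {
    "turn": ["t", "er", "n"],
    "left": ["l", "eh", "f", "t"],
    "now": ["n", "aw"],
}

def to_phonemes(text):
    """Step 1: transcribe written words into phonemes."""
    phonemes = []
    for word in text.split():
        key = word.strip(".,?!").lower()
        # Unknown words fall back to one symbol per letter.
        phonemes.extend(LEXICON.get(key, list(key)))
    return phonemes

def sentence_type(text):
    """Step 2: decide what kind of sentence is to be spoken."""
    return "question" if text.strip().endswith("?") else "statement"

def pitch_contour(count, kind):
    """Step 3: a crude prosody model, rising for questions, falling otherwise."""
    if count == 0:
        return []
    end = 1.2 if kind == "question" else 0.8
    if count == 1:
        return [end]
    return [1.0 + (end - 1.0) * i / (count - 1) for i in range(count)]

def synthesis_plan(text):
    """Step 4: pair each speech unit with the pitch factor to apply to it."""
    phonemes = to_phonemes(text)
    contour = pitch_contour(len(phonemes), sentence_type(text))
    return [(unit, round(factor, 3)) for unit, factor in zip(phonemes, contour)]
```

A real synthesizer would look up a recorded fragment for each unit in its sound catalog, apply the pitch and duration changes, and join the fragments smoothly; the sketch stops at the plan that would drive those steps.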
Using a video camera, software is able to combine a driver’s lip movements with spoken commands, thus increasing accuracy even in the noisiest environments
Dialog with Diane. "We’ve already developed such a system for extremely small platforms such as mobile phones," says Dr. Michael Lützeler, who is responsible for implementing Papageno at CT. Lützeler’s goal is to make voice quality more natural and therefore suitable for more demanding tasks. "We need to improve the quality of individual speech elements and of prosody," he says. Like his colleagues in voice recognition, Lützeler is counting on the extra processing power and memory capacity that onboard systems possess in comparison to mobile phones.
Genuinely natural conversations with a virtual passenger will become a reality only when voice recognition and voice synthesis are linked using dialog components. It will then be possible to ask complex questions in a completely normal way—"Where’s the fuel cap?" or "Direct me to the nearest cheap gas station"—and the system will be able to understand and answer many of them on the basis of information already on board or sourced from the Internet. Here, CT is pinning its hopes on the Diane dialog machine, which has already proved itself in applications for telephony, such as in answering systems.
And Siemens has even bigger plans for cars that can talk. For example, in conjunction with onboard systems, personalized voice recognition, as already used in certain phone applications, could prevent unauthorized use of the vehicle. Such systems could also be equipped with a module to recognize different languages. That way, rental cars would be able to switch automatically from one language to another, depending on the language the driver speaks. And, last but not least, speech styles could even be matched to the driver’s age, as determined by the sound of his or her voice, and thus switch from a laid-back to a more dignified way of talking. In other words, Krekels’ virtual passenger could be almost indistinguishable from the real thing.
Rolf Sterbak