How do you analyze data?
Uszkoreit: By posing a wide variety of questions. They can be very precise, such as those we ask when we’re analyzing trends for consumer research. They can also be less precise, for example when the data fluctuates. Take the increase in drug prescriptions. Here we want to know what causes the increase. When you deal with vast amounts of data, you sometimes get answers to questions you didn’t even ask.
Where do you think the evaluation of data will generate the biggest economic or social benefits?
Uszkoreit: The main benefit is that we make not only structured data but also all types of unstructured data, such as texts, images, and voice recordings, usable. In addition, we can find correlations within the data. For example, we can compare weather station data with harvest information or traffic statistics, or nutritional facts with medical data. Society benefits from this knowledge of complex processes and interrelationships, no matter whether it’s in the healthcare sector or related to the economy. The objective is to find patterns and to create correct digital depictions of the real world. To enable computers to understand language, you also need large amounts of data and in-depth knowledge of the world.
What risks do you see?
Uszkoreit: There’s a potential risk whenever knowledge about individuals, groups or processes can be misused. For example, it could involve someone who uses medical information to blackmail patients or has detailed knowledge of a bank building’s architecture and gives it to gangsters.
What do you recommend?
Uszkoreit: You can compare this situation with the financial system. Although there is always a certain risk that money will lose its value, we haven’t banned it and returned to bartering. Instead, we have created systems for protecting our money. In the real world, we have formal legal concepts for regulating the handling of goods and money. This is something we still lack in the virtual world. However, failing to exploit the potential that data offers us would almost be as though physicists were to stop conducting research because it might lead to the creation of a dangerous weapon — even though they would then forgo the possibility of discovering a solution for humanity’s energy problems, for example.
What do you think the concrete benefits will be?
Uszkoreit: In the case of a jet turbine, for example, you can prevent a disaster from happening by evaluating sensor data so that you can detect problems early on before the turbine fails. You can do the same with bridges, as they don’t collapse by chance either. Such analyses are also beneficial in medicine; they enable doctors to detect diseases and their causes much earlier than would otherwise be the case.
Could the predictions be as far-reaching as those in the movie Minority Report, where crimes are prevented before they are even committed?
Uszkoreit: It would certainly be possible to detect patterns that could, for example, tell us where and when crimes might occur, as well as under what conditions. However, you can’t analyze individual cases. It would be a mistake to think you could make accurate predictions regarding such an extremely complex system as a human being. A tiny cause can lead to major changes in behavior. I think that human behavior is subject to nonlinear processes, and thus also to chaos theory’s butterfly effect, in which the flapping of a butterfly’s wings can eventually trigger a storm far away.
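The butterfly effect mentioned here can be illustrated with a small numerical sketch. It uses the logistic map, a standard textbook example of a chaotic nonlinear system. It is not a model of human behavior, and the starting value, perturbation size, and function names are assumptions chosen only for illustration.

```python
def logistic_step(x, r=4.0):
    """One step of the logistic map x -> r * x * (1 - x); r = 4 is in the chaotic regime."""
    return r * x * (1.0 - x)


def divergence(x0, delta, steps):
    """Track the gap between two trajectories whose starting points differ by delta."""
    a, b = x0, x0 + delta
    gaps = [abs(a - b)]
    for _ in range(steps):
        a, b = logistic_step(a), logistic_step(b)
        gaps.append(abs(a - b))
    return gaps


if __name__ == "__main__":
    # Hypothetical values: a start of 0.2 and a perturbation of one part in ten billion.
    gaps = divergence(0.2, 1e-10, 60)
    for step in (0, 10, 20, 30, 40, 50, 60):
        print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

The gap starts out negligible and grows until the two trajectories are unrelated. In the same way, aggregate patterns can be detectable even when an individual trajectory cannot be predicted.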
In many cases, data becomes especially valuable when it can be assigned to specific individuals. What do you think about this development?
Uszkoreit: Although we depersonalize and anonymize the data, it’s also true that the data will nevertheless allow you to draw conclusions regarding specific individuals if it is very diverse and reflects people’s lifestyles, environments, and physical conditions. For example, geneticists can recognize every individual on the basis of his or her gene sequence. But how should one handle such information in view of the great medical benefits it provides? Another example is the use of surveillance cameras in subway stations. Although these cameras film you, they do it to improve public safety. Or imagine you live in a house that’s famous for its architecture and you forbid Google Street View to display the building on the Internet. What right do you have to dictate that the building can only be viewed by a rich Australian who can afford to fly over here in order to take a look at your house, for example, but not by his less wealthy neighbor who would like to simply look at it on the Web?
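The point that diverse data can identify people even without names can be sketched with a few invented records. The following is a minimal illustration, not a description of any real dataset or anonymization method. The records, the field choices (postal code, birth year, sex), and the function name are all hypothetical.

```python
from collections import Counter

# Invented, nameless records: (postal code, birth year, sex).
records = [
    ("10115", 1980, "F"),
    ("10115", 1980, "F"),
    ("10115", 1975, "M"),
    ("10117", 1980, "F"),
    ("10117", 1962, "M"),
]


def unique_fraction(rows, fields):
    """Share of rows whose combination of the chosen fields occurs exactly once."""
    keys = [tuple(row[i] for i in fields) for row in rows]
    counts = Counter(keys)
    singled_out = sum(1 for key in keys if counts[key] == 1)
    return singled_out / len(rows)


if __name__ == "__main__":
    print("postal code only:  ", unique_fraction(records, [0]))
    print("all three combined:", unique_fraction(records, [0, 1, 2]))
```

With the postal code alone, nobody in this toy set stands out. Once the three attributes are combined, most records match only one person, which is why richer data makes anonymity harder to maintain.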
Does that mean that informational self-determination has to be reconsidered in today’s age of the Internet, social media, and big data?
Uszkoreit: Yes. The question is where my rights to my data begin and where they end. To return to the previous example, is the view of a house a right that automatically comes with the property? Or, to give another example, do I own the image made of my broken leg? I neither took the X-ray picture nor did I pay for it. But such images might help to heal other people’s legs. Is it morally justifiable to prevent the image from being used to treat other people? I believe we have to rethink the whole matter. Does all data have to be owned by someone? The legal regulations concerning the Internet are indeed uncharted territory. When should we focus on informational self-determination and when on the common good? The courts still have to clarify a lot of issues here. However, we as individuals also have much more power nowadays than in the past.
What do you mean by that?
Uszkoreit: Customers can band together in forums, for example, and put dealers under a lot more pressure than in the past by boycotting certain products or production processes. Consumers can also form purchasing cooperatives, sign petitions, and initiate referendums. The outrage disseminated on the Internet is an unpleasant demonstration of how influence can be exerted here. A major transformation is currently under way in society. Digital democracy is becoming more feasible and will eventually be introduced.
If you take a look into the future — say to 2050 — what do you think the world of big data will look like?
Uszkoreit: The virtual world will become more and more like the real one. We’ll live in it and with it. An example would be 3-D data rooms that we can actually enter. But these systems will need security features like those in the real world. For example, not everyone is authorized to open a bank safe, and documents are kept locked away. This is done to protect individuals as well as society at large. However, because the real and the virtual worlds will coincide, nobody will be able to own the view of a house or an X-ray image, whether in the real world or in the virtual one. But the biggest challenge we will face with big data is that of time. A limited amount of forgetting has to be possible in the future, but we don’t want a walled-up library either. And how will data remain continuously usable? Imagine that we had the equivalent of all of the data that is produced today for the past 4,000 years. That’s what things will be like 4,000 years from now. Historical research would be done very differently from the way it is today. But without further technological advances this flood of data might inundate us.