**Post by Jose Carrillo**
2/19/13: Software That Tracks People on Social Media Created by Defense Firm
Today, the internet has become an ever-increasing part of our daily lives. It encroaches on everything from our social lives to our habits. We are becoming addicted to “sharing” every moment and action (whether purposefully on Facebook, or unknowingly on Google). Every action on our smartphones, computers, and websites creates vast amounts of information that companies can “mine” for profit. Some of this data is extremely helpful: it lets companies present people with news stories they are more interested in, make games more fun, make the internet faster, answer questions better, and display more relevant advertising. Still, the sheer amount you can learn about a person is unbelievable. The scariest thing of all is that most of us are not aware of how much information is really out there, all the way down to our habits, history, and location. What is more, there are currently few if any government controls on how this data can be used.
Recently, there have been a lot of developments in the field of computer science allowing engineers to create programs like RIOT, Raytheon’s data mining software that helps predict the movements of individuals based on their social networking activity. These programs analyze data that most people do not realize they have shared: IP addresses, EXIF data embedded in a picture (which can include sensitive information like location), comments, check-ins, pictures, etc. The things we post online contain all kinds of information, from what we look like and where we are to what we like, who we know and socialize with, and what we are likely to do.
The creep factor comes in when you realize there are no limits to what this data can be used for, who can collect it, who can view it, and who controls it. Very few countries, if any, have policies in place to protect people from these kinds of privacy breaches. This makes me question whether it is up to me as a software engineer to draw the line on what types of data I collect and how it is used.
This privacy problem is not just related to spying and defense contractors either. It is an integral part of the development of the “cloud” going into the future. Companies like Google, Apple, and Facebook face the same types of ethical questions. Take Google, for instance: it collects all kinds of data from every search you do, and you do not even know about it. It gathers everything from IP addresses, cookies, accounts, and search terms to location, browser type, and computer, and connects it all to its ever-growing network of data to build a profile on you so that it can serve you quicker, more relevant answers to your searches. But is this kind of data collection and privacy intrusion justified by faster, better, more relevant results? I do not believe current efforts to make users aware of exactly what data is used, and how, are enough. People are not in control of their identity online, and this is a problem. On Facebook, for instance, you have to go through some insane process to deactivate your account, and even then they refuse to delete your profile unless you write an official letter. Unfortunately, this is one area where it is mostly up to policy makers to create laws forcing companies to be clearer and more transparent about their data use. We need to escape the terms of service no one ever bothers to read.
Companies can and do mine all kinds of data from the web and can learn a lot about me from my likes, friends, hobbies, payment forms, purchases, location, house, habits, etc. I am fearful of the future when I think about the amount of data collected and what could be done with it if it were used for the wrong purposes or if the wrong person got access to it. Facebook, on the other hand, thrives on people sharing more and more information online. It wants to become the hub of the internet, connecting websites, people, services, apps, and events. Other services and websites even willingly create tie-ins to Facebook, all in the name of increasing traffic (Spotify, Netflix, newspapers). Facebook then uses all this data to create more targeted advertising (a harmless end use in itself). But does Facebook need to reconsider its business practices in light of software such as RIOT? Anyone can scrape Facebook for information on anyone else.
This raises several ethical questions. Is it right to collect this data in the name of advancing technological innovation? Is it right to use this data if people are putting it out there willingly? Is it my responsibility to educate people on the data they leave around just by browsing the web? Should I, as an engineer, draw the line and stop such data mining? Or is it justified just because people willingly post such data on websites like Facebook? Is Facebook responsible for what others do with the data it collects? If I were a Facebook engineer, would I have to consider these types of questions?
We are entering the age of the “cloud,” and it is bringing about some of the most amazing technological innovations seen to date, innovations like IBM’s Watson, which can leverage all this knowledge online and may one day help diagnose patients more quickly. But is this enough justification for the collection and analysis of data? I have always been in favor of innovation in the name of making the world a better place. As a computer scientist, I have always wanted more data because of the things you can accomplish with it, but now I am starting to realize the importance of a user’s autonomy online, especially since the internet is essentially its own living ecosystem, independent of any one country. It is important to allow people to control their information and how it is used in order to preserve the freedom that the internet has come to represent. The internet is one of the greatest achievements of this generation, and I would not like to see it ruined.
This discussion brings out interesting viewpoints when considered from the perspective of the ethical theories discussed in class. For instance, from a utilitarian viewpoint the internet is the greatest thing ever made: no issue of privacy can ever negate its overwhelming benefits and convenience. From a deontological viewpoint, the internet is meant for all the right things but also carries the responsibility of making it clear what data is being used in return for its services. These positive viewpoints are in line with how most people feel about the internet, and at the same time they have led to an overwhelming trust in it, a trust many companies now exploit. I believe that the internet has an unimaginable ability to do good in the world. But because of this very ability to provide us with services we are desperately addicted to, and because of its unregulated nature, it has created an environment full of promise and lacking any control. Anything is possible on the internet, and that is not always a good thing (it reminds me of the iconic phrase “Who watches the watchmen?”). Who is to blame when it goes bad? And how do I, as an engineer, approach design in an environment where I am the only one regulating anything?
It is especially hard to deal with these issues when people have become the product (most people do not realize that on the internet they and their information are the product, not the services they use). As an engineer, I have a responsibility and duty to my company, but also a competing responsibility to the people. Moving in a direction that is beneficial to both groups is hard when their interests generally do not align. Facebook wants openness while people want stricter privacy settings, and those settings compete with Facebook’s need to make money.
Google’s mission is to index all knowledge and to one day answer any question instead of just showing you where an answer may be. Facebook’s mission is to connect everyone in the world and create an unprecedented human network of relationships, likes, loves, and personalities. The government’s mission is to protect its citizens (every government shares this duty to its people, regardless of whether it is a dictatorship, a democracy, or a communist state). But at what cost? How can I, as an engineer, decide whether techniques such as data mining are ethical just because the end goal is good? How do I know where to draw the line? What should be my mission when it comes to privacy in the “cloud”?