Big Data

The term Big Data has received a lot of attention recently, but it was first used decades ago. Big Data refers to a technology phenomenon that has arisen since the mid-1980s. As computers have improved, growing storage and processing capacities have provided new and powerful ways to gain insight into the world by sifting through the vast quantities of data available. Even though Big Data applications have advanced in the last few years, there is still a long way to go. Big Data also plays a big role in data privacy and tracking, since its complex data sets require a set of techniques and technologies with new forms of integration to reveal insights.

The Big Data Era
It is well known that the amount of data produced daily keeps growing. Nowadays, around 2.5 quintillion bytes of data are produced every day. It is estimated that 90 percent of all the data in the world today has been generated since 2010. However, size is not everything. Today it is possible to store a huge amount of data on a microchip the size of a fingernail and to process it at very high speed. It is also a matter of data types: data can be text, an image, video, audio or even a “like”. The main problem at present is that the majority of this data is unstructured, meaning that it is not organized in a database. As Susan Etlinger argued in her TED talk (2014), data don’t create meaning, people do.
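To put the daily figure quoted above in perspective, the short sketch below converts 2.5 quintillion bytes into more familiar units. It assumes decimal (SI) units, where one terabyte is 10^12 bytes; the function name and the unit table are illustrative choices, not part of any standard library.

```python
# Daily data volume quoted in the text: 2.5 quintillion bytes.
BYTES_PER_DAY = 25 * 10**17

# Decimal (SI) units; this choice is an assumption of the sketch.
UNIT_SIZES = {"TB": 10**12, "PB": 10**15, "EB": 10**18}


def to_unit(num_bytes, unit):
    """Express a byte count in the given unit (TB, PB or EB)."""
    return num_bytes / UNIT_SIZES[unit]


exabytes_per_day = to_unit(BYTES_PER_DAY, "EB")
terabytes_per_day = to_unit(BYTES_PER_DAY, "TB")

# Average rate over the 86,400 seconds of a day.
terabytes_per_second = to_unit(BYTES_PER_DAY / 86_400, "TB")

print(f"{exabytes_per_day} EB per day")
print(f"{terabytes_per_day:,.0f} TB per day")
print(f"about {terabytes_per_second:.0f} TB per second")
```

In other words, the quoted figure corresponds to 2.5 exabytes a day, or roughly 29 terabytes every second.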

How to deal with it
In the past, enterprises handled large data sets with relational databases and data warehouses from proprietary suppliers. However, these simply can’t handle the volumes of data being produced today. Nowadays, most companies don’t use supercomputers and big data warehouses to deal with Big Data. Instead, they use cloud computing and powerful mining algorithms that run on computers connected by a network. Large data-intensive companies, such as Amazon and Google, are taking the lead in developing some of the technologies used to handle Big Data.
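The idea of spreading work across networked computers can be illustrated with a small sketch. The example below follows a split-and-combine (map-reduce style) pattern: each chunk of text is counted independently, as if on its own machine, and the partial results are then merged. The text above does not name a specific algorithm, so this pattern, the function names and the sample data are all assumptions made for illustration; here everything runs in a single process.

```python
from collections import Counter
from functools import reduce


def map_chunk(lines):
    """Count the words in one chunk; in a real cluster this would run on one machine."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def combine(left, right):
    """Merge two partial word counts into one."""
    return left + right


def count_words(chunks):
    """Process every chunk independently, then merge the partial results."""
    partials = [map_chunk(chunk) for chunk in chunks]
    return reduce(combine, partials, Counter())


# Hypothetical data, split into two chunks as if stored on two machines.
chunks = [
    ["big data is big", "data is everywhere"],
    ["people create meaning", "data is a tool"],
]

totals = count_words(chunks)
print(totals.most_common(3))
```

Because each chunk is processed without looking at the others, more machines can be added as the data grows, which is the property that makes this style of processing attractive for Big Data.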

Storage
To store all this data, a new type of database has come into use, since the “old” databases impose many restrictions when storing unstructured data. These are known as NoSQL databases. They achieve performance gains by doing away with some (or all) of the restrictions traditionally associated with conventional databases, in exchange for scalability and distributed processing.
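A toy example may help show what dropping those restrictions looks like in practice. The sketch below is a minimal in-memory document store: records do not have to share a fixed schema, and keys are spread over several shards, which stand in for separate machines. The class `TinyDocumentStore` and its methods are hypothetical and do not correspond to the API of any real NoSQL product.

```python
import zlib


class TinyDocumentStore:
    """Toy schema-less store; each shard stands in for a separate machine."""

    def __init__(self, num_shards=3):
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_for(self, key):
        # A stable hash of the key decides which shard holds the document.
        index = zlib.crc32(key.encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def put(self, key, document):
        self._shard_for(key)[key] = document

    def get(self, key):
        return self._shard_for(key).get(key)

    def find(self, predicate):
        # Ask every shard and gather the matching documents.
        return [
            doc
            for shard in self.shards
            for doc in shard.values()
            if predicate(doc)
        ]


store = TinyDocumentStore()

# Documents of different shapes live side by side; no table definition is needed.
store.put("post:1", {"type": "text", "body": "Hello, world"})
store.put("photo:7", {"type": "image", "width": 800, "height": 600})
store.put("like:42", {"type": "like", "user": "ana", "target": "post:1"})

likes = store.find(lambda doc: doc["type"] == "like")
print(store.get("photo:7"))
print(likes)
```

What this sketch gives up is also instructive: there are no joins, no enforced column types and no cross-document transactions, which is the kind of trade a NoSQL database makes in return for easy distribution.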

Analysis
Several techniques are used to analyze this data: machine learning, natural language processing, predictive modeling and neural networks. Among them, machine learning stands out. Machine learning algorithms use pattern recognition to learn from data. Put simply, humans stop trying to teach the computer how to understand the given data. Instead, the data is given to the computer, and the computer itself tries to interpret it. Many advances in Big Data came about because of machine learning.
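The difference between writing rules by hand and letting the computer learn from data can be seen in a very small example. The sketch below uses a nearest-centroid classifier, one of the simplest machine learning methods, chosen here only for illustration: it averages the labeled examples of each class and then assigns a new point to the class with the closest average. The measurements and the labels "small" and "large" are made up.

```python
from collections import defaultdict


def train_centroids(samples):
    """Learn one average point (centroid) per label from labeled examples."""
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [total / counts[label] for total in totals]
        for label, totals in sums.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given point."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical training data: (measurement 1, measurement 2) -> label.
training = [
    ((1.0, 1.2), "small"),
    ((0.8, 1.0), "small"),
    ((4.0, 4.5), "large"),
    ((4.2, 3.9), "large"),
]

centroids = train_centroids(training)
print(predict(centroids, (1.1, 0.9)))
print(predict(centroids, (3.8, 4.1)))
```

Nobody wrote a rule such as "a value above 2.5 means large"; the boundary between the two classes comes entirely from the examples, and it shifts automatically if the training data changes.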

Future with Big Data
Google, for example, has been using Big Data to develop driverless cars. The computer embedded in the car has to collect and understand data about its surroundings on its own. For example, if the computer “sees” a red traffic light, it has to understand that it can’t continue and must stop. If all cars were connected to one another, they could share information, and it would be possible to create an environment free of traffic accidents. A car would collect data from other cars, predict an accident and prevent it before it happens. In the future, similar techniques could be used to recognize patterns in human DNA and eventually find better medical treatments, or even cures, for diseases like cancer, diabetes and Alzheimer’s.

The Bad Side
When Kenneth Cukier spoke at a TED conference, he pointed out two negative implications of the advance of Big Data. The first is related to ethics and privacy. Police, for example, may use location data to determine where to send patrols. However, they may then start looking into individuals’ data: their high school transcripts, whether or not they are unemployed, their credit scores, their web-surfing behavior. How far should companies be allowed to go in using consumers’ information? What about the security of this information? These are questions that society will have to deal with.

The second problem is that Big Data will take jobs away. Big Data algorithms will challenge highly skilled professionals. Consider, for example, a lab technician who observes a cell to determine whether or not it is cancerous. He or she will be challenged by a pattern recognition algorithm that can solve this problem quickly. Big Data technology has to be used as a tool for human needs, not as a substitute for people.

It is evident that society can’t escape the Big Data Era, and it is not far off. Many companies already take advantage of this technology. Netflix collects and analyzes data from its customers so that it knows what kind of TV shows it should produce. Amazon does the same so that it can suggest products its customers would like. The Semantic Web, or Web 3.0, is part of this new era: a type of Web where each person has a completely different experience even when visiting the same webpage. Big Data can be used for great advances, such as finding cures for diseases or stopping global warming. Big Data, like any technology, is a tool, but unless we are careful, it will burn us.