What Does This Election Tell Us About Big Data?
By Sharad Varshney,
Posted November 9, 2016
Old Prediction Models
Frenzied media and despairing people are calling this the biggest upset ever. More than an upset, it came as a surprise to many. The media is ‘upset’ because it was not able to predict the result. The reason is simple: the old conventional statistical models do not work anymore. Their approach was not advanced enough to predict the outcome. In these models, people are asked questions and a prediction is made from their answers. The shortcoming is that the models don’t incorporate enough variables, and that is why the prediction went wrong. Now we need to predict with big data.
Theories are now emerging that data analysis does not work for election forecasts. But numbers don’t give you wrong answers; we simply didn’t add up all the numbers. So we need a big data approach to predicting election results. News organizations are spending money on data visualization but are not making sufficient efforts to embrace big data. Big data would bring new and emerging data sources into their election forecast models.
We need to focus on what these old prediction models are lacking and how to bring the missing variables into the calculation:
The rise of Social Media
Back in the old days, people relied on the media to form opinions on socio-political issues. Now, with the rise of social media, people also form their opinions based on partial facts and conspiracy theories. There was a great deal of data on social media about this presidential election that could have been crucial in predicting its outcome, but this data was never analyzed. It could have given insight into how people were thinking and how agitated they were. New prediction models need to find a way to incorporate social media data.
Why Do People Vote?
This election made it clear that the black population did not vote as much as it did for Barack Obama, while the white working class voted more. Surveys and prediction models take racial demographics into account, but they do not include a ‘why people vote’ variable. Data from social media and phone records can give a very good idea of this.
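To make the turnout point concrete, here is a minimal Python sketch, not any pollster's actual model. The group names, population shares, support levels and turnout figures are all made up for illustration, and `Group` and `forecast` are hypothetical names. It compares a forecast weighted only by population share with one that also weights each group by how likely its members are to vote.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    population_share: float  # share of eligible voters (made-up number)
    support: float           # fraction of the group favoring candidate A (made-up)
    turnout: float           # assumed probability that a member actually votes


def forecast(groups, use_turnout):
    """Return candidate A's expected vote share across all groups.

    With use_turnout=False each group counts by population share alone.
    With use_turnout=True each group is also weighted by its turnout.
    """
    weights = [
        g.population_share * (g.turnout if use_turnout else 1.0)
        for g in groups
    ]
    total = sum(weights)
    return sum(w * g.support for w, g in zip(weights, groups)) / total


# Two hypothetical groups of equal size with different enthusiasm.
groups = [
    Group("group_x", population_share=0.5, support=0.70, turnout=0.40),
    Group("group_y", population_share=0.5, support=0.40, turnout=0.80),
]

print(f"population only: {forecast(groups, use_turnout=False):.2f}")
print(f"with turnout:    {forecast(groups, use_turnout=True):.2f}")
```

With these invented inputs, candidate A leads when only population shares are counted, and the race becomes a tie once turnout is included. The hard part, as argued above, is estimating the turnout figure itself; the sketch simply assumes it is known.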
It’s high time to change the old election forecast models and include big data technology in them. Most decisions taken by humans are based on emotions. Data scientists need to find a way to measure human emotions and include them in their prediction models.
Hopefully, in 2020, the media will use advanced big data technology to take human emotions into account in their analysis.