Can the manipulation of big data change the way the world thinks?

Jonathan Gornall writes that the companies that control our data will soon be able to shape our very intellectual foundations.

Andy Jassy, the chief executive of Amazon Web Services, which is storing huge amounts of data in the cloud. Mike Blake / Reuters

At first, even the internet experts gathered in Las Vegas for the world’s biggest cloud-computing event thought it was nothing more than a huge joke when Andy Jassy, chief executive of Amazon Web Services (AWS), presented his company’s latest solution for managing the world’s increasingly unmanageable torrent of digital data.

But the 14-metre lorry-hauled “ruggedised” shipping container that rumbled on stage at The Venetian Hotel on November 30 was no joke. Rather, it was an admission that the arteries of the internet are close to being clogged.

The AWS Snowmobile is for companies that have accumulated so much data that it would take decades to upload it to the cloud. The container, a vast hard drive on wheels, is driven to a company’s data centre and connected by cable. It sucks up the data and, accompanied by guards, delivers it to one of Amazon’s many cloud centres around the world.

In essence, Amazon has resorted to a clunky analogue solution to a growing digital problem. It will come and collect your network-choking digital waste in a large metal bucket.

Mr Jassy told his Vegas audience that when Amazon launched its AWS cloud division in 2006, “the notion of an exabyte of data was completely out there”. Today, “you would not believe how many companies have an exabyte of data they want to move to the cloud”.

An exabyte is a billion billion bytes, or the equivalent of all the audio, printed and video material held by the United States Library of Congress, 500 times over.

The problem is that uploading this much data to the cloud via even a fast network connection would take about 25 years. Using 10 Snowmobiles, each capable of holding 100 petabytes of data (one tenth of an exabyte), it would take “only” six months, said Mr Jassy.
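The 25-year figure can be sanity-checked with some back-of-envelope arithmetic. The sketch below is only illustrative: the article does not say what counts as "a fast network connection", so the 10 gigabits per second used here is an assumption, as are the function and constant names.

```python
# Rough check of how long one exabyte takes to upload over a single link.
# ASSUMED_LINK_BPS is an assumption; the article does not state a link speed.

EXABYTE_BYTES = 10**18                 # a billion billion bytes
PETABYTE_BYTES = 10**15
SECONDS_PER_YEAR = 365.25 * 24 * 3600
ASSUMED_LINK_BPS = 10 * 10**9          # hypothetical 10 gigabits per second


def upload_years(num_bytes, link_bits_per_second):
    """Years needed to push num_bytes through a link running flat out."""
    seconds = num_bytes * 8 / link_bits_per_second
    return seconds / SECONDS_PER_YEAR


years = upload_years(EXABYTE_BYTES, ASSUMED_LINK_BPS)

# Ten Snowmobiles at 100 petabytes each hold exactly one exabyte.
snowmobile_capacity = 10 * 100 * PETABYTE_BYTES

print(f"One exabyte at the assumed link speed: about {years:.0f} years")
print(f"Ten Snowmobiles cover one exabyte: {snowmobile_capacity == EXABYTE_BYTES}")
```

Under that assumed speed the result lands close to the quarter-century quoted on stage. The six-month Snowmobile estimate is Mr Jassy's own and is not derived here.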

The Snowmobile is evidence that our commercially driven obsession with digitising every aspect of our existence is out of control. As of October 2016, the Internet Archive’s Wayback Machine, a depository of more than 279 billion web pages, held “only” 15 petabytes of data. That there are now companies holding data equivalent to some 66 copies of that entire archive is mind-boggling.

Like the actual universe following the Big Bang, the digital universe, of which the internet is but a fraction, is expanding rapidly. It is doubling in size every two years. According to market research company International Data Corporation (IDC), in 2013 that universe consisted of 4.4 zettabytes of data (one zettabyte being a thousand exabytes, equivalent to streaming high-definition video for 36,000 years). By 2020 it will expand to 44 zettabytes (binge-watching Netflix for more than 1.5 million years).

If all the world’s data in 2013 were stored on 128GB iPad Airs stacked one on top of the other, says IDC, the pile would extend two-thirds of the way to the Moon. By 2020, there will be more than six such piles, each reaching all the way to the Moon.
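For readers who want to see how these growth figures hang together, here is a minimal sketch. It assumes a clean doubling every two years from the 2013 baseline, which is a simplification rather than IDC's actual model, and the names used are purely illustrative.

```python
# Compound-growth check on the figures attributed to IDC in the article.
# A smooth two-year doubling is an assumption, not IDC's published method.

BASE_YEAR = 2013
BASE_ZETTABYTES = 4.4          # size of the digital universe in 2013
DOUBLING_PERIOD_YEARS = 2
FORECAST_2020_ZETTABYTES = 44  # the forecast quoted in the article


def projected_zettabytes(year):
    """Size in zettabytes if the total doubles every two years."""
    elapsed = year - BASE_YEAR
    return BASE_ZETTABYTES * 2 ** (elapsed / DOUBLING_PERIOD_YEARS)


projection_2020 = projected_zettabytes(2020)

# The 2013 pile of tablets reached two-thirds of the way to the Moon,
# so tenfold growth gives this many full Earth-to-Moon piles.
growth_factor = FORECAST_2020_ZETTABYTES / BASE_ZETTABYTES
piles_2020 = growth_factor * (2 / 3)

print(f"Doubling rule gives about {projection_2020:.0f} zettabytes in 2020")
print(f"Tenfold growth means about {piles_2020:.1f} piles reaching the Moon")
```

The doubling rule comes out at roughly 50 zettabytes, in the same region as the 44 quoted, and tenfold growth turns two-thirds of a pile into a little under seven, which squares with "more than six".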

Companies are amassing ever greater volumes of data, hoarding and analysing every trace of our digital lives, the better to sell us products and control our thoughts and behaviours. We are, of course, willing parties to this extraordinary manipulation of our existence.

Much of this data is, after all, generated by us, from using mobile phones, sending emails, posting on Facebook and tweeting, to online banking and shopping, watching YouTube videos or Netflix films and blithely, if uncomprehendingly, uploading every photograph we take to the cloud. (A neat trick. We now pay to access pictures we once happily held on our own hard drives.)

Our digital incontinence is only going to get worse. This is the year we embrace en masse the “internet of things”, the largely pointless but commercially lucrative connectivity of everything from kettles and fridges to home lighting and security systems, not to mention the soon-to-be ubiquitous “conversational interfaces” of Amazon Echo and Google Home.

Is there a benefit to this extraordinary revolution? Certainly, for the companies driving it. According to IDC, the value of the global “big data and business analytics market” will grow from $130 billion (Dh477 bn) at the end of 2016 to $203 billion by 2020.

But what is this unprecedented accretion of intelligence going to do for those of us without shares in Amazon, Google, Apple and the rest? Big data is, after all, in the hands of big corporations and it is already clear that the main purpose of mining and analysing it is to better part us from our money – the Google ads that seem to know your mind are just the start.

That’s fine. The world has always revolved around the relationships between makers, sellers and buyers of stuff. But big data is different. It gives those who control it the ability to shape every aspect of our intellectual and cultural outlooks.

By default, the manipulation of big data is about parting us from our imaginations and the mind-broadening benefits of serendipity. Thumb through a newspaper, for example, and you might well stumble upon something that challenges your preconceptions and makes you think differently about the world and your place in it.

Rely on algorithms for your news and views, and you will be fed a steady diet of your own cultural and political prejudices, in a spiral of narrowing perspective.

We should not, in other words, embrace the wholesale digitisation of human existence as the unalloyed good thing that the purveyors of the Amazon Echo and other toys would have us believe it to be.

IBM likes to say that 90 per cent of all the data in the world today, including all the books, art and films and so on generated in the previous history of the human race, has been created in the past two years alone.

Think about that – all the countless works of art you could never possibly see, the tens of millions of books you will never read, the music you will never hear. Now imagine, in digital form, nine times everything that had ever been created by human hand being generated in just the past two years, and then wonder about the value of this vast data-berg we are creating.

Before the digital era, the record of human existence was curatable, as witnessed by the world’s great libraries and museums. For millennia, human beings have left records of their existence for the benefit and wonder of future generations, from the 18,000-year-old Stone Age cave paintings in south-west France and the surviving fragments of cuneiform script of ancient Mesopotamia to the art and literature of modern times.

Yet such traces of humanity have, by definition, always been selective. Not everyone could write or draw and not every aspect of every person’s life was considered worthy of archiving. Now, however, it is clear that to future historians we are bequeathing not a priceless treasure trove that will allow them to make perfect sense of our time, so much as a vast digital landfill problem, a databank of trivia beyond curating.

This will be the year that big data comes of age, aided and abetted by our embrace of the burgeoning internet of things and, through the vanity-serving medium of social media, our seemingly inexhaustible fascination with the tedious minutiae of our generally not very fascinating lives.

If we aren’t more selective about what we commit by the truckload to the digital universe, historians may one day look back on 2017 as the year we stuck our heads in the cloud and lost both our imaginations and our minds.

Jonathan Gornall is a regular contributor to The National