The challenge of storing large amounts of data

Here is an article on the challenges of storing large amounts of data, by Steve Ranger (@steveranger), published on July 1, 2015 in ZDNET’s special feature on The Evolution of Enterprise Storage.

My position?

<begin quote>

And as Florentin Albu, CIO of Rothamsted Research, said: “Storing data is relatively cheap once the foundation of storage and backup infrastructure is in place. The headache comes from the side of data management, ensuring the data can be retrieved with a high degree of relevance and – specifically for large data sets – that its accuracy and integrity is maintained during processing.”

<end quote>

Read the full article here:

http://www.zdnet.com/article/battling-the-giant-data-monster/

What’s your digital story saying about you?

Are you in control of your digital self? Where do you stand on the issue of privacy in a digital society? Since you are likely reading this on a social network, do you understand the current privacy concerns?

Materials that try to answer these questions are usually known for curing insomnia, so the notable exceptions are worth highlighting.

Al Jazeera took a refreshing approach to privacy, Big Data and the Digital Society in a recent publication called “Terms of Service”. To make the concepts easier to follow, the authors, Michael Keller and Josh Neufeld, chose the format of a short comic book/novel. The approach worked on me: I simply could not put it down until I reached the end, and although I am not a fan of comic books, I found it a rewarding 20-minute read.

The authors write about privacy in the Digital Society, giving a useful short history of how the matter became a concern and how the Internet of Things and Big Data accentuate it. Using Dan Geer’s framework of “Yesterday, Today and Tomorrow Questions”, the novel explains how we tend to slide down the slippery slope of trading privacy for features and perceived cost savings. It also brings in the views of Prof. Scott Peppet on how peer pressure leads more people to reveal their private information on social networks.

By connecting all the digital dots that a person leaves in cyberspace, a “story” is formed. The authors stress how important it is to understand who gets to tell the story. If the story is told by someone other than its subject, it often leaves that person having to defend the truth. This can be a difficult task, and the example given is of an employer judging how well a prospective employee would fit based on their social media profile. The novel points out that younger generations in particular instinctively try to control their image through a rich digital presence that allows them to tell their version of their digital story first.

Trading immaterial personal information for immediate tangible benefits (e.g. discounts or use of products) gradually builds up a digital profile. The Digital Society by its nature offers unprecedented access to data and enables companies and governments to draw conclusions about individuals based on these profiles. The examples given in the book (e.g. linking one’s social network to their fitness data to produce a credit score) show that such conclusions are not necessarily intuitive for, nor under the control of, the respective individual. However, this does not stop these conclusions from significantly affecting the person’s lifestyle, financial position, etc.

The novel concludes by highlighting the constant trade-off that we make in a Digital Society between privacy and convenience – with the latter winning most of the time.

The foundation of our Digital Society is being built at full speed. This evolution brings increased benefits and – no doubt – increased complexity and risks. The rules of the new game need to strike the right balance between convenience and privacy, and in order to do so, we as a society need a broader understanding of these concepts. This is why the timing and format of Al Jazeera’s publication seem to be just right.

What is your position on privacy and digital society? Are social networks a necessary evil, or have you already put on the tinfoil hat? How successful are you in mastering your digital self?

You can read Al Jazeera’s comic novel “Terms of Service” here:

http://projects.aljazeera.com/2014/terms-of-service

 

Dawn of data-centric computing

In what seems to be a major step forward in solving big data problems and advancing research capabilities, IBM announced today that the U.S. Department of Energy has awarded it a $325 million contract to build the world’s most advanced data-centric supercomputer by 2017.

The data-centric concept is explained by IBM as one which “moves much of the processing to the places where the data is stored, whether that’s within a single computing system, in a network of computers, or far away in sensors tracking the weather or monitoring an energy pipeline”. This is needed because, particularly in the context of big data, one of the main challenges is not the actual processing power, but the speed at which vast quantities of data can be transported to and from the processing facilities.
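
To make the idea more concrete, here is a minimal Python sketch of my own; it is an illustration, not IBM’s design. It assumes three hypothetical sensor sites, 8 bytes per raw reading and 16 bytes per summary, and compares how much data has to travel when every reading is shipped to a central machine versus when each site computes a small summary locally and ships only that.

```python
from dataclasses import dataclass

# Assumed sizes, for illustration only: a reading is one 8-byte float,
# a summary is an 8-byte count plus an 8-byte running total.
READING_BYTES = 8
SUMMARY_BYTES = 16


@dataclass
class Summary:
    """Partial aggregate computed at the site where the data lives."""
    count: int
    total: float


def centralised_mean(sites):
    """Ship every raw reading to one place, then compute the mean there."""
    moved = sum(len(readings) for readings in sites) * READING_BYTES
    all_readings = [r for readings in sites for r in readings]
    return sum(all_readings) / len(all_readings), moved


def data_centric_mean(sites):
    """Compute a small summary at each site and ship only the summaries."""
    summaries = [Summary(len(readings), sum(readings)) for readings in sites]
    moved = len(summaries) * SUMMARY_BYTES
    count = sum(s.count for s in summaries)
    total = sum(s.total for s in summaries)
    return total / count, moved


# Three hypothetical sensor sites with 100,000 made-up readings each.
sites = [[float(i % 50) for i in range(100_000)] for _ in range(3)]

mean_a, bytes_a = centralised_mean(sites)
mean_b, bytes_b = data_centric_mean(sites)

print(f"centralised:  mean={mean_a}, bytes moved={bytes_a}")
print(f"data-centric: mean={mean_b}, bytes moved={bytes_b}")
```

Both approaches give the same answer, but the second moves only a few dozen bytes instead of megabytes. Real workloads are rarely this tidy, since not every computation can be split into small per-site summaries, but the sketch shows why moving the processing to the data can matter more than raw processing power.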

Until now, most hopes for revolutionizing big data processing rested on quantum computing. Quantum computers are particularly suitable for tackling big data because of their ability to simultaneously process different approaches to the same problem. Progress in quantum computing has been slower than expected because, among other issues, problems need to be defined in a whole new way before they can be solved with this technology.

In quantum computing, the Canadian company D-Wave leads the game, with high-profile clients such as Google, NASA and (reportedly) the NSA.

In the new contract with the DoE, IBM brings the data-centric computing approach and has joined forces with NVIDIA (GPU and interconnectivity) and Mellanox Technologies (interconnectivity).

The level of investment, effort and risk puts these projects well out of reach of most companies and research institutions. However, as more options develop, we are likely to see new methods and approaches for dealing with big data, which will gradually become mainstream.

A few links:

– the concept of data-centric computing explained in (rather) simple terms:

http://www.engadget.com/2014/11/14/ibm-data-centric-computing/

– IBM’s data-centric design vision:

http://research.ibm.com/articles/datacentricdesign/

– a video that tries to describe some quantum computing terms:

http://mashable.com/2013/10/13/google-quantum-computing-video/

Digital disruption on the farm – prescriptive planting

An interesting article from The Economist (24 May 2014) on the use of big data on the farm and its relationship with prescriptive planting – the “system that tells them [farmers] with great precision which seeds to plant and how to cultivate them in each patch of land”.

The article mentions The Climate Corporation, founded by two former Google employees, which mapped US land, soil and weather data in order to use the aggregated data for selling crop insurance. Monsanto later bought the company for about USD 1 billion, one of the largest takeovers of a data company. #bigdataagriculture

Read more about it here:

http://www.economist.com/news/business/21602757-managers-most-traditional-industries-distrust-promising-new-technology-digital