Big Data Security: How to Safeguard People's Data and Assess Your Data Protection Readiness
Sun, April 18, 2021

When you think of big data, you might think of enormous data sets collected from various sources, wrote Christo Petrov on Techjury, a tech news website. What sets such data apart is that, because of its complexity and volume, it cannot be stored or processed using conventional tools. Most of the data aggregated by governments, companies, and individuals is confidential to a certain extent, explained Gordon Haff of The Enterprisers Project, a CIO community-powered resource.

For example, the annual number of reported cases of a disease is a matter of public record, enabling key stakeholders to understand whether a program is progressing or corrective measures are needed. The names of those suffering from the disease, however, are generally protected by law. As big data sets become a part of more use cases, protecting user privacy becomes an even more pressing concern.

Big Data Statistics

The big data market grew by 61% in 2012 and 60% in 2013, the highest rates recorded at the time, according to German statistics platform Statista. In 2018, market growth stood at 20% and was expected to slow to 17% in 2019. Growth is projected to keep slowing, plateauing at 7% from 2025 to 2027.

About 40 zettabytes of data were estimated to exist by 2020, and roughly 2.5 quintillion bytes of data are generated each day, according to Data Never Sleeps 5.0, via software company DOMO. Meanwhile, 97.2% of organizations are investing in big data and AI, according to a survey conducted by New Vantage Partners, a big data and business strategy consulting firm.

New Vantage Partners gathered respondents from approximately 60 Fortune 1000 companies, including American Express, Motorola, and NASDAQ. Of the executives surveyed, 62.5% said their organization had appointed a Chief Data Officer (CDO), a 12% increase since 2012.

A record 97.2% of enterprises have invested in big data and AI initiatives. By size of investment, 60.3% of respondents invested less than $50 million, and 27% said their organizations’ cumulative investments in big data and AI fall between $50 million and $500 million. Only 12.7% said their companies invested over $500 million in big data and AI initiatives.

 4 Approaches On How Not to Compromise Data Privacy 

 1.     Anonymization

Anonymization removes identifying details from a data set before it is shared. For example, if you’re sharing medical images to compare and gauge the effectiveness of different diagnostic techniques, you could remove the patient’s name and other sensitive information. More often than not, such data is actually pseudonymized: a trusted party replaces personally identifiable fields with artificial identifiers.

However, one of the problems with anonymization is that it’s not always crystal clear “what can be used to identify someone and what can’t,” especially when you’re correlating with other data sources, public ones included.
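As a rough illustration of pseudonymization, the sketch below replaces identifying fields with keyed hashes. The field names, the sample record, and the `SECRET_KEY` value are assumptions made up for this example; in practice the key would be held only by the trusted party.

```python
import hashlib
import hmac

# Hypothetical secret held only by the trusted party (an assumption for this sketch).
SECRET_KEY = b"replace-with-a-real-secret"

def pseudonymize(record, id_fields=("name", "ssn")):
    """Return a copy of `record` with identifying fields replaced by artificial IDs."""
    out = dict(record)
    for field in id_fields:
        if field in out:
            # A keyed hash (HMAC) gives a stable identifier that cannot be
            # recomputed by someone who does not hold the key.
            digest = hmac.new(SECRET_KEY, str(out[field]).encode("utf-8"), hashlib.sha256)
            out[field] = digest.hexdigest()[:16]
    return out

# Illustrative record, not real data.
patient = {"name": "Jane Doe", "ssn": "000-00-0000", "zip": "02139", "diagnosis": "flu"}
safe = pseudonymize(patient)
```

Note that the `zip` field passes through untouched. Fields like this are exactly the kind that can re-identify someone once the data is correlated with other sources, which is the weakness described above.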

 2.     Differential Privacy

With traditional anonymization methods, it is often unclear how well they actually safeguard privacy. With differential privacy, by contrast, the algorithm adds random noise to a data set to safeguard individual privacy in a “mathematically rigorous way.” Since the data is “fuzzed,” the results may not be as accurate as those from the raw data, though it depends on the technique. Overall, differential privacy remains an area of active research.
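One common way to add such noise is the Laplace mechanism applied to a counting query. The sketch below is a minimal illustration under that assumption; the function names, the sample ages, and the choice of `epsilon` are invented for the example, and a production system would need a vetted library rather than this floating-point version.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(values, predicate, epsilon, rng=None):
    """Noisy count of the items matching `predicate` (Laplace mechanism)."""
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for v in values if predicate(v))
    # A count changes by at most 1 when one person is added or removed,
    # so its sensitivity is 1 and the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Illustrative data: how many people are 50 or older?
ages = [34, 51, 29, 62, 47, 38, 55]
noisy = private_count(ages, lambda a: a >= 50, epsilon=0.5)
```

A smaller `epsilon` means more noise and stronger privacy; a larger one gives answers closer to the true count. That is the accuracy trade-off mentioned above.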

 3.     Fully Homomorphic Encryption

Fully homomorphic encryption lets a third party perform complicated processing on data without seeing the data itself. Unfortunately, it is costly (in a computational sense) and is not practical yet. If its capabilities can be harnessed, however, it would add an extra layer of protection against data leaks when you’re using public cloud or other service providers to analyze data sets.
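To give a feel for computing on encrypted data, here is a toy version of the Paillier cryptosystem. Be aware that Paillier is only additively homomorphic (it supports adding encrypted numbers), so it is a simpler relative of fully homomorphic encryption, not the real thing. The tiny primes and function names are assumptions chosen for readability and offer no real security.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy key generation; real keys use primes hundreds of digits long."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, needs Python 3.8+
    return n, (lam, mu)   # public key n, private key (lam, mu)

def encrypt(n, m, rng=random):
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # With generator g = n + 1, the ciphertext is g^m * r^n mod n^2.
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    # Multiplying ciphertexts adds the hidden plaintexts (mod n).
    return (c1 * c2) % (n * n)

n, priv = keygen()
c_total = add_encrypted(n, encrypt(n, 120), encrypt(n, 305))
total = decrypt(n, priv, c_total)  # 425, computed without decrypting the inputs
```

The party calling `add_encrypted` never sees 120 or 305; only the holder of the private key can read the sum.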

 4.     Secure Multi-Party Computing (MPC)

MPC replaces a trusted third party with a protocol that preserves certain security properties (e.g., privacy and correctness) even if some of the parties involved collude or try to subvert the protocol. One use case of MPC is when companies are willing to share data with the government or other organizations for analysis, as long as no one else can see that data.
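A simple building block behind many MPC protocols is additive secret sharing. The sketch below simulates several parties computing a joint total without any of them revealing its own number. It assumes every party follows the protocol honestly, and the modulus, function names, and salary figures are made up for illustration.

```python
import random

# Assumed modulus: a prime (2**31 - 1) larger than any total we expect.
PRIME = 2_147_483_647

def share(secret, n_parties, rng):
    """Split `secret` into random shares that add up to it modulo PRIME."""
    shares = [rng.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def secure_sum(private_values, rng=None):
    """Simulate a joint sum in which no party reveals its own value."""
    if rng is None:
        rng = random.Random()
    n = len(private_values)
    # distributed[i][j] is the share that party i sends to party j.
    distributed = [share(v, n, rng) for v in private_values]
    # Each party j adds up the shares it received and publishes only that subtotal.
    partials = [sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)]
    return sum(partials) % PRIME

# Three companies learn their combined payroll figure, not each other's.
salaries = [52_000, 61_500, 48_250]
total = secure_sum(salaries)  # 161750
```

Each individual share looks like a random number, so a single share says nothing about the value it came from. Real protocols add further checks to cope with parties that cheat or collude.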

 4 Tips to Assess Your Company’s Privacy and Data Protection Readiness

 1.     Never Forget the Real Data Owner

It is easy to be distracted by terms like “big data” and “data leaks” and to think of data as a faceless pool of information controlled by large corporations and specialist brokers, wrote professional services network Crowe Global, via business news platform Forbes. But you should not forget that the real owner is the individual. Each data point you collect represents a real human being. It therefore becomes easier to adjust to new protections and regulations if your privacy framework begins with the individual in mind.

 2.     Be Simple

Privacy policies will become more robust and resilient to change if they have fewer exclusions, corner cases, and exemptions. One company, for example, decided to extend GDPR and CCPA practices to all people, regardless of their citizenship or location. This might be a costlier approach, but it’s simpler and more ethical to implement, said Pamela Hrubey, managing director in the consulting group at Crowe.

 3.     Merge Privacy and Data Security Practices

How tightly linked are privacy and data security practices? That is a subject of debate. But organizations that manage them separately will struggle to stay resilient when changes abound. Presently, the reputational and financial costs of breaches in both areas are an increasing concern among senior leaders.

Hrubey commented, “You need to have a focus on both privacy and information security if you really want to protect both the organization and the individual data owners.”

 4.     Think of Your Brand’s Approach to Privacy Ethics

A shift in mindset or terminology can have an impact on how you approach privacy and security. Hrubey said if you think that doing the right thing is an essential part of your brand, or if you believe “there is a foundational value in being compliant, then that’s your starting point.”

A brand will be more resilient to change if it fully accounts for the reputational costs of privacy violations, learns from its mistakes, and adapts to ever-evolving tastes. One top pharmaceutical company shares information about its privacy program, including data about issues it has investigated and how it responded to those vulnerabilities, Hrubey mentioned. “And that transparency builds trust,” she added.

There are techniques to safeguard data, but businesses should remember that the real owner of the data they collect is the individual. Brands should be transparent with their privacy programs and policies to build trust among customers. Without transparency, businesses are digging their own graves.