The new path to privacy tech through data infrastructure


By Xavier Quintuna | February 17, 2021

Editor’s note: Data infrastructure and privacy tech will be the focus of the upcoming “Data and privacy: The new rules of the game” virtual event that Orange Silicon Valley will host on March 3. See the official event page for information on how to register.

Data infrastructure has been part of the enterprise since at least the 1980s, thanks to data warehousing, which opened up a $20 billion market. Data warehousing offered new ways to transform raw data from an unstructured format into a structured format. The practice exposed a generic interface (Structured Query Language, or SQL) to business intelligence teams to explore the status of ongoing business over time. This allowed enterprises to gain a systematic, time-based perspective of their business performance – financially and operationally.

Business intelligence (BI), with a market size of $23.1 billion, is an important component for exposing the current state of a business. However, within the last 10 years, an explosion of business innovation created the need to capture more diverse data assets from non-traditional sources, such as video, audio, and photos, which do not fit into the traditional data warehouse model. In addition, proprietary data warehouse hardware running on-premises became very expensive to scale up and down. To answer those challenges, the data lake architecture model was born.

Data lake offerings

Data lake offerings, which now have a market size of $20 billion, emerged as a possible solution, providing centralized, low-cost storage with computation capabilities. This data infrastructure model came with on-premises resources to store any data source in its natural state and to support different computational paradigms, such as MapReduce, which can implement distributed algorithms, as well as a SQL-like interface for BI. However, data lakes ran into the same issue as data warehouses: they are difficult and expensive to scale on-premises. As a result, the cloud became an alternative for decreasing the cost of storing data.
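To make the MapReduce paradigm mentioned above more concrete, here is a minimal single-process sketch in Python. It is not how any particular data lake product implements it; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the `region`/`amount` fields are hypothetical, chosen only for illustration. A real framework would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit one (key, value) pair per input record."""
    yield record["region"], record["amount"]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence (in one process, for illustration)."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sales records stored "as is" in the lake.
sales = [
    {"region": "EU", "amount": 10},
    {"region": "US", "amount": 5},
    {"region": "EU", "amount": 7},
]
totals = run_job(sales)
```

The job computes roughly what a BI analyst would express in SQL as `SELECT region, SUM(amount) ... GROUP BY region`, which is why SQL-like interfaces can be layered on top of this kind of engine.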

Businesses are now asking future-facing questions using machine learning and AI. Meanwhile, companies like Facebook and Google are making large profits of around $270 billion from data generated by their customers. Customer data became an asset to the industry, enabling the creation of personalized services, including advertisements and recommendations. However, customers remained largely unaware of what was happening until privacy groups and the European Union started to observe and raise concerns over data usage without user consent.

GDPR and CCPA pushed Apple (among other companies) to rethink how data was being used in its business. The company notably began using privacy as a way to gain customers’ trust, modeling itself as a data privacy gatekeeper in its marketing. In the meantime, customer discomfort began pushing companies, including Facebook and Google, to rethink their business models, as well as how they use data with privacy as an expressed value.

As time went on, privacy regulations became a challenge in the wake of growing ML and AI deployments. The cost of running data lakes on-premises created incentives to adopt cloud-based infrastructures. The new challenge, however, is how to use customer-generated data while complying with regulations and without compromising privacy. Traditional enterprises face new regulatory and legal roadblocks: any data-related project now needs to be evaluated by legal and risk teams. This approval process can take anywhere from a week to a month or more, reducing team productivity. In response to this challenge, the industry created a new model called privacy tech.

Privacy tech

Privacy tech is a combination of technologies that spans on-premises data lakes and data warehouses and the public cloud. The model orchestrates the movement of data from on-premises repositories to the cloud, supporting and validating all the legal and risk requirements needed to implement a company’s own privacy rules. This movement of data across different storage solutions creates another challenge: keeping track of the type and location of data at large scale. That challenge is prompting a rethink of what data governance should entail. Tracking data records with metadata is one possible solution. Centralized metadata repositories maintain records of the location, status, stage, and type of different data assets in public and private clouds. Metadata thus becomes an important part of achieving privacy. Traceability and transparency could help legal teams respond to requests from regulators and improve the productivity of data teams.
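As a rough illustration of the centralized metadata repository described above, the following Python sketch keeps one record per data asset (location, status, stage, and type) and logs every movement so that the asset can be traced later. Everything here is an assumption made for the example: the class names `AssetRecord` and `MetadataCatalog`, the `contains_personal_data` flag, and the location labels are hypothetical and do not describe any specific product.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AssetRecord:
    """Metadata for one data asset (hypothetical schema)."""
    asset_id: str
    location: str              # e.g. "on-prem-warehouse", "public-cloud-bucket"
    status: str                # e.g. "active", "archived"
    stage: str                 # e.g. "raw", "anonymized"
    data_type: str             # e.g. "customer_profile", "video"
    contains_personal_data: bool


class MetadataCatalog:
    """In-memory stand-in for a centralized metadata repository."""

    def __init__(self):
        self._records = {}     # asset_id -> current AssetRecord
        self._history = []     # append-only log of (event, asset_id, location)

    def register(self, record):
        """Add a new asset and log where it was first stored."""
        self._records[record.asset_id] = record
        self._history.append(("register", record.asset_id, record.location))

    def move(self, asset_id, new_location):
        """Record that an asset moved, e.g. from on-premises to the cloud."""
        current = self._records[asset_id]
        self._records[asset_id] = replace(current, location=new_location)
        self._history.append(("move", asset_id, new_location))

    def personal_data_locations(self):
        """List every location that currently holds personal data."""
        return sorted(
            {r.location for r in self._records.values() if r.contains_personal_data}
        )

    def trace(self, asset_id):
        """Return the full logged history of one asset, oldest first."""
        return [event for event in self._history if event[1] == asset_id]


catalog = MetadataCatalog()
catalog.register(
    AssetRecord("crm-001", "on-prem-warehouse", "active", "raw", "customer_profile", True)
)
catalog.register(
    AssetRecord("vid-007", "on-prem-lake", "active", "raw", "video", False)
)
catalog.move("crm-001", "public-cloud-bucket")
```

With a catalog like this, a legal team answering a regulator could call `catalog.personal_data_locations()` to see where personal data currently lives, or `catalog.trace("crm-001")` to show how a given asset moved over time. A production system would also need persistent storage, access control, and audit guarantees that this sketch leaves out.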

Privacy tech is becoming a main focus for IT departments and C-level executives. Geographic Scope and Forecast predicts the market will reach $100 billion by 2027. The VC community has already committed more than $8 billion in investments, with an aggregate valuation of $35 billion. Privacy tech could represent the consolidation and maturation of enterprise operational data analysis in a manner that ensures a trusted relationship with customers. It is also an emerging opportunity for new technology and innovation.

As with any technology inflection point, it is important to analyze the return on investment (ROI) of privacy tech. The Cisco 2021 Data Privacy Benchmark Study points to six areas of benefit: reducing sales delays, mitigating losses from data breaches, enabling innovation, achieving operational efficiency, building trust with customers, and making the company more attractive. Those areas show the impact of data on daily business operations and are important to consider for current and future strategies. Privacy tech is not only a technology stack; it is also a cultural mindset within a company. Apple is one of the leaders in the consumer space in raising awareness of the misuse of data by social network companies and other industries. Orange has signed an agreement to become the security vault for its customers’ data, making it one of the leaders in protecting data in the enterprise arena. Enterprises like Orange have started to invest in technology and innovations to maximize the privacy of their customers and support their goals around AI and data. Privacy tech could be a baseline to support multiple AI and ML innovations that will improve services and allow for personalization that, taken together, will benefit customers.

Orange Silicon Valley is proud to be hosting the upcoming virtual event “Data and privacy: The new rules of the game” on March 3. See the official event page to register.