Performing analytics on datasets can reveal confidential information about organizations or individuals. Even separate datasets that each contain seemingly benign data can reveal private information when they are analyzed jointly. This can lead to intentional or inadvertent breaches of privacy.
Addressing these privacy concerns requires an understanding of the nature of the data being accumulated and of the relevant data privacy regulations, as well as special techniques for data tagging and anonymization. For example, telemetry data, such as a car’s GPS log or smart meter readings, collected over an extended period of time can reveal an individual’s location and behavior, as shown in Figure 3.1.
Figure 3.1 Information gathered from running analytics on image files, relational data and textual data is used to create John’s profile.
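The risk of joint analysis described above can be illustrated with a minimal sketch. It assumes two made-up datasets: a set of clinic visits with names removed, and a public directory that lists names alongside the same three attributes (ZIP code, birth year and gender). All records, field names and the `link_records` helper are hypothetical and exist only for illustration. Neither dataset exposes a named person’s diagnosis on its own, but matching them on the shared attributes does.

```python
# Hypothetical, made-up records for illustration only.
# Dataset 1: clinic visits with names removed ("anonymized").
clinic_visits = [
    {"zip": "02139", "birth_year": 1975, "gender": "M", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_year": 1982, "gender": "F", "diagnosis": "asthma"},
    {"zip": "02141", "birth_year": 1975, "gender": "M", "diagnosis": "hypertension"},
]

# Dataset 2: a public directory that contains names but no medical data.
public_directory = [
    {"name": "Person A", "zip": "02139", "birth_year": 1975, "gender": "M"},
    {"name": "Person B", "zip": "02139", "birth_year": 1982, "gender": "F"},
    {"name": "Person C", "zip": "02141", "birth_year": 1990, "gender": "M"},
]

# Attributes that are harmless alone but identifying in combination.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def link_records(anonymized, identified, keys=QUASI_IDENTIFIERS):
    """Return (name, diagnosis) pairs for records that match exactly one person."""
    # Index the identified dataset by its combination of quasi-identifiers.
    index = {}
    for person in identified:
        key = tuple(person[k] for k in keys)
        index.setdefault(key, []).append(person["name"])

    linked = []
    for record in anonymized:
        names = index.get(tuple(record[k] for k in keys), [])
        # Only a unique match re-identifies the record with confidence.
        if len(names) == 1:
            linked.append((names[0], record["diagnosis"]))
    return linked


reidentified = link_records(clinic_visits, public_directory)
for name, diagnosis in reidentified:
    print(f"{name}: {diagnosis}")
```

In this sketch, two of the three “anonymized” visits are tied back to a named person. The third is not, because no directory entry shares its combination of attributes. Anonymization techniques generally aim to ensure that such combinations are never unique to one individual.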