Differential Privacy

Full Title or Meme

Differential Privacy is a system for publicly sharing information about a data collection by describing the patterns of groups within the collection while withholding information that could identify the Subject.

Context

Differential Privacy is a mathematical technique that injects inaccuracies, or "noise," into the data, making some people younger and an equal number older, or changing race or other attributes. The more noise you inject, the harder deanonymization becomes. Apple and Facebook started using this technique in 2019 to collect aggregate data without identifying particular users.
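To make the noise-injection idea concrete, here is a minimal sketch of the Laplace mechanism, one standard way to add calibrated noise to a numeric query. The function name and parameters are illustrative, not taken from any particular library.

```python
# A minimal sketch of the Laplace mechanism (illustrative only).
import numpy as np

def private_count(true_count: int, epsilon: float) -> float:
    """Return a noisy count satisfying epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so Laplace noise with scale
    1/epsilon suffices.
    """
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Smaller epsilon -> more noise -> stronger privacy, less accuracy.
print(private_count(1000, epsilon=0.1))   # heavily perturbed
print(private_count(1000, epsilon=10.0))  # close to the true count
```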

Problems

Too much noise can render the data useless. One analysis showed that a differentially private version of the 2010 census included households that supposedly had 90 people.[1] It appears that biases toward "normal" statistics in the randomization would hide any data that was not "normal".
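A quick illustration of this failure mode, reusing the Laplace-mechanism sketch above with an aggressive privacy budget (all numbers here are made up): rare categories get swamped by noise, producing implausible values like the 90-person households described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative household-size counts: most households are small,
# and a few genuinely large ones are rare outliers.
true_counts = {2: 5000, 4: 3000, 8: 40, 15: 2}

epsilon = 0.05  # very strong privacy -> Laplace noise with scale 20
noisy_counts = {size: count + rng.laplace(scale=1.0 / epsilon)
                for size, count in true_counts.items()}

for size in true_counts:
    print(f"size {size}: true={true_counts[size]}, "
          f"noisy={noisy_counts[size]:.0f}")
# The rare categories (counts of 40 and 2) can come out negative or
# wildly inflated, while the common categories survive almost intact.
```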

Solutions

U.S. Census Bureau officials said the agency is revamping its systems to prevent anyone from using published data to target individual respondents through the information they disclosed to the census. The bureau aims to use a mathematical process, called Differential Privacy, to modify census results sufficiently to reliably conceal respondents' identities. The agency will make small additions to and subtractions from each number prior to almost every table's publication, and significantly cut the number of published statistics. Although data users are concerned these changes will disrupt their use of census data, not addressing the danger could allow information on individuals to be exposed, violating federal privacy law and elevating the risk of identity theft and other kinds of misuse.[2]
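The "small additions and subtractions" approach can be sketched with integer-valued noise. Below is an illustrative two-sided geometric mechanism, a common discrete analogue of the Laplace mechanism; it is not the Census Bureau's actual algorithm, and the names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def two_sided_geometric_noise(epsilon: float) -> int:
    """Integer noise from a two-sided geometric distribution, a
    discrete analogue of Laplace noise for count data.

    The difference of two i.i.d. geometric variables with success
    probability p = 1 - exp(-epsilon) is two-sided geometric, which
    gives epsilon-differential privacy for sensitivity-1 counts.
    """
    p = 1.0 - np.exp(-epsilon)
    return int(rng.geometric(p) - rng.geometric(p))

true_table_cell = 1203
published = true_table_cell + two_sided_geometric_noise(epsilon=0.5)
print(published)  # an integer near 1203, never a fractional count
```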

From ACM (2021-01-11):

As privacy violations have become rampant, and calls for better measures to protect sensitive, personally identifiable information have primarily resulted in bureaucratic policies satisfying almost no one, differential privacy is emerging as a potential solution.

In "Differential Privacy: The Pursuit of Protections by Default," a Case Study in ACM Queue, Google’s Damien Desfontaines and Miguel Guevara reflect with Jim Waldo and Terry Coatta on the engineering challenges that lie ahead for differential privacy, as well as what remains to be done to achieve their ultimate goal of providing privacy protection by default.

Differential privacy is based on a mathematically rigorous definition of privacy, one that allows the guarantees a system offers against re-identification to be formalized and proved. It yields measures of privacy that can be quantified and reasoned about, and then used to apply suitable privacy protections.
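For reference, the rigorous definition being alluded to is the standard one: a randomized mechanism M is ε-differentially private if, for every pair of datasets D and D' that differ in one person's record and every set S of possible outputs,

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S]
```

Smaller ε means the two output distributions are nearly indistinguishable, so the published result reveals little about any one individual.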

In September 2019, Google released an open-source version of its differential privacy library, making the capability generally available. To date, differential privacy has been adopted by the US Census Bureau, along with a number of technology companies.

References

  1. Angela Chen, "Differential Privacy," Technology Review (2020-03), p. 27.
  2. Paul Overberg, "Census Overhaul Seeks to Avoid Outing Individual Respondent Data," The Wall Street Journal (2019-11-10).