Recall that reading reports must be submitted on Angel before noon (12
p.m.) the day of the lecture.
| Date | Syllabus | Reading |
|---|---|---|
| Aug. 30 | Introduction, Overview of SDL techniques
(slides) |
Recommended background: L. Wasserman, All of
Statistics, Springer Texts in Statistics.
|
| Sep. 6 | No lecture. |
| Sep. 13
| Overview of Techniques in CS literature, begin discussion of definitions (slides)
|
| Sep. 20 | Statistics background: log-linear models and
contingency tables (STAT 504 page).
Crypto background: definitional
approach, semantic security. (board lecture, no notes)
| Required reading:
- Executive Summary from James Waldo, Herbert S. Lin, and
Lynette I. Millett, editors. Engaging
Privacy and Information Technology in a Digital Age,
National Academies Press, 2007, 452 pp.
- S. Fienberg. Contingency Tables and Log-Linear
Models, Journal of the American Statistical Association,
Vol. 95, No. 450. (Jun., 2000), pp. 643-647.
- Y. Lindell. Secure
Multiparty Computation for Privacy Preserving Data
Mining. Unpublished essay (available from
http://www.cs.biu.ac.il/~lindell/research-statements/mpc-ppdm.htm).
Additional Reading:
|
| Sep. 27 |
Crypto: SFE to differential privacy (slides)
| Required:
Additional Reading:
- A. Dobra, S. E. Fienberg, A. Rinaldo and
Y. Zhou. "Confidentiality Protection and Utility for
Contingency Table Data: Algorithms and Links to
Statistical Theory", Manuscript, 2007. Available from
the Angel page for this class.
|
| Oct. 4 | Statistics: methods for tabular data and
the role of algebraic statistics (slides
(pdf))
Crypto: Achieving differential privacy (slides
(pdf))
| Required:
- A. Blum, C. Dwork, F. McSherry, and K. Nissim, Practical
Privacy: The SuLQ Framework, Principles of Database
Systems, 2005.
- Cynthia Dwork, Frank McSherry, Kobbi
Nissim, Adam Smith: Calibrating
Noise to Sensitivity in Private Data
Analysis. Theory of Cryptography Conference (TCC)
2006, pp. 265-284.
- Dobra, A. and Sullivant, S. (2004). A
divide-and-conquer algorithm for generating Markov bases
of multi-way tables. Computational Statistics, 19,
347-366.
|
| Oct. 11-Dec. 6 | Student lectures; see topics below.
|
| Dec. 13 | Sofya Raskhodnikova: Smooth sensitivity, sample-and-aggregate
|
Required:
|
| Opt/req. | Topic | Reading | Date | Student Presenting |
| Required | k-anonymity and other cluster-based methods
[slides (ppt), (pdf)]
|
Required:
Additional Reading:
- David Martin, Daniel Kifer, Ashwin Machanavajjhala,
Johannes Gehrke, and Joseph Halpern. Worst-Case
Background Knowledge. In Proceedings of the 23rd
International Conference on Data Engineering (ICDE
2007). Istanbul, Turkey, April 2007.
- Gagan Aggarwal, Tomas Feder, Krishnaram Kenthapadi,
Samir Khuller, Rina Panigrahy, Dilys Thomas, An Zhu. Achieving
anonymity via clustering. In Principles of
Database Systems (PODS) 2006, pp. 153-162.
| Oct. 11 | Ge Ruan
|
| Required | Synthetic Data Sets [slides (.pdf)] | Required:
- Reiter, J. P. and Raghunathan, T. E. The
multiple adaptations of multiple imputation, Journal
of the American Statistical Association, 2007.
- Reiter, J. P. Releasing
multiply-imputed, synthetic public use microdata: An
illustration and empirical study. Journal of the
Royal Statistical Society, Series A, 168, pp. 185 - 205,
2005.
Additional Reading:
- Raghunathan, T. E., Reiter, J. P., and Rubin,
D. B. Multiple imputation for statistical disclosure
limitation. Journal of Official Statistics, 19,
pp. 1-16, 2003.
- Cornell Virtual
Research Data Center
- J. M. Abowd and S. Woodcock (2004). Multiple-Imputing Confidential Characteristics and File Links in Longitudinal Linked Data. Privacy in Statistical Databases, pp. 290-297.
- J. M. Mateo-Sanz, A. Martinez-Balleste, and J. Domingo-Ferrer (2004). Fast Generation of Accurate Synthetic Microdata. Privacy in Statistical Databases, pp. 298-306.
| Oct. 18 | Michael Lin
|
| Optional
| Utility and Risk in Tabular Protection [slides (pdf)]
| Required:
- Dobra, A., Fienberg, S. E. and Trottini, M. (2003). Assessing
the risk of disclosure of confidential categorical data.
Bayesian Statistics 7 (J. M. Bernardo, M. J. Bayarri,
J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith and
M. West, eds.), Oxford University Press, 125-144.
Additional Reading:
| Oct. 18
| Juyoun Lee
|
| Required | Differential Privacy and Game Theory [slides .ppt,
.pptx, pdf] |
| Oct. 25 | Eric Shou
|
| Required
| Privacy for Social Network Data [slides]
| Required:
Additional Reading:
- Michael Hay, Gerome Miklau, David Jensen, Philipp
Weis, and Siddharth Srivastava. Anonymizing
Social Networks. University of Massachusetts,
Amherst, Technical Report, 2007.
- K. Frikken and P. Golle. Private
Social Network Analysis. In Workshop on Privacy
in the Electronic Society (WPES) 2006.
| Nov. 1 | Machigar Ongtang
|
| Required
| Software
| Required:
- μ-ARGUS and τ-ARGUS. Software and the manuals
can be found at CENEX
SDC.
- A. Hundepool (2006). The
ARGUS software in CENEX. Privacy in Statistical
Databases, Lecture Notes in Computer Science. Springer
Berlin/Heidelberg, Vol. 4302, pp. 334-346.
Optional:
| Nov. 1 | Venkatesh S. Amudhan
|
| Optional topics |
|---|
| Optional
| Privacy in Distributed Databases
| Required:
- Yehuda Lindell and Benny Pinkas. Privacy
Preserving Data Mining. Advances in Cryptology --
Crypto '00 Proceedings, LNCS 1880, Springer-Verlag,
pp. 20-24, August 2000. A full version appeared in the
Journal of Cryptology, Volume 15 - Number 3, 2002.
- Karr, A. F., Lin, X., Sanil, A. P., and Reiter,
J. P. (2006), Secure
statistical analysis of distributed databases, in
Statistical Methods in Counterterrorism: Game Theory,
Modeling, Syndromic Surveillance, and Biometric
Authentication. Edited by A. Wilson, G. Wilson, and
D. Olwell. New York: Springer, 237 - 262.
Additional:
| Nov. 15 | Ge Ruan
|
| Optional | Modern variants of randomized response |
Required:
Additional Reading:
| Nov. 8 | Adam Smith
|
| Optional
| Utility and Risk in Microdata [slides (pdf)]
| Required:
Additional Reading:
- Karr, A. F., Kohnen, C. N., Oganian, A., Reiter, J. P. and Sanil, A. P. (2006). A framework for evaluating the utility of data altered to protect confidentiality. The American Statistician, 60, pp. 224 - 232.
- Peter-Paul de Wolf (2006). Risk, Utility and PRAM. Privacy in Statistical Databases, Lecture Notes in Computer Science. Springer Berlin/Heidelberg, Vol. 4302, pp. 189-204.
- Loredana Di Consiglio and Silvia Polettini (2006). Improving Individual Risk Estimators. Privacy in Statistical Databases, Lecture Notes in Computer Science. Springer Berlin/Heidelberg, Vol. 4302, pp. 243-256.
| Dec. 6 | Juyoun Lee
|
| Optional
| Remote Access Servers
|
- S. Gomatam, A. F. Karr, J. P. Reiter and
A. P. Sanil. Data
Dissemination and Disclosure Limitation in a World
Without Microdata: A Risk-Utility Framework for Remote
Access Analysis Server. Statistical Science, 20,
pp. 163 - 177.
- Reiter, Jerome and Kohnen, Christine. Categorical
data regression diagnostics for remote access
servers, Journal of Statistical Computation and
Simulation, 75, pp. 889 - 903.
| Nov. 29 | Michael Lin
|
| Optional
| Methods for Tabular Data Protection
| Required:
Additional Reading:
|
| Optional
| Methods for Microdata Protection [slides (pdf)]
| Required:
Additional Reading:
| Nov. 15 | Eric Shou
|
| Optional
| Case Studies [slides (pdf)]
| Optional:
| Nov. 29 | Machigar Ongtang
|
| Optional
| Auditors and Query Restriction
| Required:
- Krishnaram Kenthapadi, Nina Mishra, Kobbi Nissim:
Simulatable Auditing. Principles of Database
Systems (PODS) 2005, pp. 118-127.
- Nabil R. Adam and John C. Wortmann. Security-control
methods for statistical databases: a comparative
study. ACM Computing Surveys, Vol. 21, No. 4, December
1989.
(NB: This paper covers a broad
variety of techniques, but the idea is to cover only
those aspects directly relevant to query auditing. The
most relevant part is Section 3 (pages 526-534),
although the introduction is also very
helpful.)
Optional:
- Vitaly Shmatikov's slides: .ppt
- Shubha Nabar, Bhaskara Marthi, Krishnaram
Kenthapadi, Nina Mishra and Rajeev Motwani. Towards
Robustness in Query Auditing. 32nd International
Conference on Very Large Data Bases (VLDB). 2006.
| Dec. 6 | Venkatesh S. Amudhan
|
| Optional
| Differential Privacy and tabular releases
| Required:
|
| Optional
| Lower bounds on the utility of
private data analysis
| Required:
|