Assistant Professor (James L. Henderson Memorial Professorship), Department of Computer Science & Engineering, Penn State, 360F IST Building, University Park, PA 16802
Fax: 814-865-7647
What is statistical privacy and why is it important?
Many organizations, such as the government, Yahoo!, Google, Microsoft, AT&T, Amazon.com, and Netflix, collect data about you and me. In some cases (such as the Census Bureau), some of this information must be shared with the public. In other cases, sharing data is not required, but many companies in this situation do not have the resources to fully analyze their data. By making some version of their data publicly accessible, those companies could spur research that would help them improve their own products and services. A good example is the Netflix Prize: it is very likely that this contest helped Netflix improve its recommendation system. The important question is how to release useful information about the data without violating our privacy. To answer it, we need to figure out what "privacy" and "useful" actually mean. These are some of the problems that I am investigating with the help of my students.
My research is funded by NSF award #1054389.
Daniel Kifer and Ashwin Machanavajjhala. No Free Lunch in Data Privacy. To appear in SIGMOD 2011.
Qi He, Daniel Kifer, Jian Pei, Prasenjit Mitra, C. Lee Giles. Citation Recommendation without Author Supervision. WSDM 2011.
Daniel Kifer and Bing-Rong Lin. An Axiomatic View of Statistical Privacy and Utility. To appear in the Journal of Privacy and Confidentiality. This is a much more user-friendly version of our PODS paper, with new results and axioms.
Bi Chen, Leilei Zhu, Daniel Kifer and Dongwon Lee. What is Opinion About? Exploring Political Standpoints using Opinion Scoring Model. AAAI 2010.
Daniel Kifer and Bing-Rong Lin. Towards an Axiomatization of Statistical Privacy and Utility. PODS 2010. (technical report) (slides) (FAQ)
Qi He, Jian Pei, Daniel Kifer, Prasenjit Mitra, C. Lee Giles. Context-aware Citation Recommendation. WWW 2010.
Bee-Chung Chen, Daniel Kifer, Kristen LeFevre, Ashwin Machanavajjhala. Privacy-Preserving Data Publishing. Foundations and Trends in Databases, NOW publishers, 2009. (official link)
Daniel Kifer. Attacks on Privacy and de Finetti's Theorem. SIGMOD 2009. (code)
Daniel Kifer. Change Detection on Streams. Encyclopedia of Database Systems, Springer 2009.
Parag Agrawal, Daniel Kifer, Christopher Olston. Scheduling Shared Scans of Large Data Files. Proceedings of the 34th International Conference on Very Large Databases (VLDB 2008).
Ashwin Machanavajjhala, Daniel Kifer, John Abowd, Johannes Gehrke, and Lars Vilhuber. Privacy: Theory meets Practice on the Map. Proceedings of the 24th International Conference on Data Engineering (ICDE 2008). The technique presented in this paper is currently being used by the US Census Bureau for OnTheMap V3.
David Martin, Daniel Kifer, Ashwin Machanavajjhala, Johannes Gehrke, and Joseph Halpern. Worst-Case Background Knowledge for Privacy-Preserving Data Publishing. Proceedings of the 23rd International Conference on Data Engineering (ICDE 2007).
Ashwin Machanavajjhala, Daniel Kifer, Johannes Gehrke, and Muthuramakrishnan Venkitasubramaniam. l-Diversity: Privacy Beyond k-Anonymity. ACM Transactions on Knowledge Discovery from Data (TKDD). Volume 1, Issue 1, March 2007.
Daniel Kifer, J. E. Gehrke. Injecting Utility into Anonymized Datasets. Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data (SIGMOD 2006).
Ashwin Machanavajjhala, Johannes Gehrke, Daniel Kifer, and Muthuramakrishnan Venkitasubramaniam. l-Diversity: Privacy Beyond k-Anonymity. Proceedings of the 22nd IEEE International Conference on Data Engineering (ICDE 2006), Atlanta Georgia, April 2006.
Daniel Kifer, Shai Ben-David, and Johannes Gehrke. Detecting Change in Data Streams. Proceedings of the 30th International Conference on Very Large Data Bases (VLDB 2004). Toronto, Canada. August 2004.
Manuel Calimlim, Jim Cordes, Alan Demers, Julia Deneva, Johannes Gehrke, Dan Kifer, Mirek Riedewald, and Jayavel Shanmugasundaram. A Vision for PetaByte Data Management and Analysis Services for the Arecibo Telescope. Bulletin of the Technical Committee on Data Engineering, IEEE Computer Society. Volume 27, Number 4, December 2004.
Daniel Kifer, J. E. Gehrke, Cristian Bucila, Walker White. How to Quickly Find a Witness. Constraint-based mining and inductive databases. Editors: Jean-Francois Boulicaut, Luc de Raedt, Heikki Mannila. Lecture Notes in Computer Science. 2004.
Cristian Bucila, J. E. Gehrke, Daniel Kifer, and Walker White. DualMiner: A Dual-Pruning Algorithm for Itemsets with Constraints. Data Mining and Knowledge Discovery (Special Issue: Selected Papers from the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining -- Part I), Vol. 7, Issue 4, July 2003, pages 241-272.
Daniel Kifer, J. E. Gehrke, Cristian Bucila, Walker White. How to Quickly Find a Witness. Proceedings of the 22nd ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS 2003). San Diego, CA, June 2003.
Cristian Bucila, J. E. Gehrke, Daniel Kifer, and Walker White. DualMiner: A Dual-Pruning Algorithm for Itemsets with Constraints. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Alberta, Canada, July 2002.
CMPSC 497B: Introduction to Machine Learning (Spring 2010)
CSE 597D: Conformal Prediction (Spring 2010)
CSE 598A: Machine Learning (Fall 2009)
CMPSC 431W: Database Management Systems (Spring 2009)