Theoretical Computer Science Group

Theory Seminar Schedule for Fall 2014

The weekly Theory seminar provides an opportunity to get exposed to cutting-edge research in theoretical computer science, to learn to understand technical material from talks, and to give technical presentations. Graduate students interested in doing research in algorithms, complexity theory, cryptography, quantum computing, and related areas are strongly encouraged to attend.

Date | Speaker | Title
August 25 | Sofya Raskhodnikova | L_p testing
September 1 | Labor Day (no talk)
September 8 | Raef Bassily | Private Empirical Risk Minimization: Efficient Algorithms and Tight Error Bounds
September 15 | Kashyap Dixit | Optimal Property Testers Over Product Distributions
September 22 | Nithin Varma | Pairwise spanners with additive distortion
September 29 (in IST 333) | Krzysztof Onak (IBM Watson) | A Near-Optimal Algorithm for Testing Graph Isomorphism
October 6 | Meiram Murzabulatov | Sample-Based Constant-Time Algorithms for Properties of Images
October 13 (in IST 333) | Abhradeep Thakurta (Yahoo Research) | On the use of regularization in deep neural networks
October 20 | FOCS (no talk)
October 27 | Heqing Huang | RSA Key Extraction via Low-Bandwidth Acoustic Cryptanalysis
November 3 | Om Thakkar | TBA
November 10 (in IST 333) | Aaron Roth (University of Pennsylvania) | TBA
November 17 (in IST 333) | Jon Ullman (Columbia University) | TBA
November 24 | Thanksgiving (no talk)
December 1 | Eunou Lee | TBA
December 8 | Ramesh Krishnan | TBA


L_p Testing

We initiate a systematic study of sublinear-time algorithms for approximately testing properties of real-valued data, where the quality of approximation is measured with respect to L_p distances. L_p distances generalize the standard Euclidean distance (L_2) and the Hamming distance (L_0). By real-valued data, we mean datasets that are meaningfully represented as real-valued functions. Our algorithms, called L_p-testers, are required to distinguish datasets that have a required property from those that are far in L_p distance from any dataset with the required property. This is a generalization of standard property testers, which are defined with respect to the Hamming distance. Using general L_p distances is more appropriate for many computational tasks on real-valued data.
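To make the distance measure concrete, here is a minimal Python sketch of the L_p distance between two datasets represented as equal-length lists of real values. The function name `lp_distance` is hypothetical, and the distance is left unnormalized; the framework presented in the talk may normalize it differently.

```python
def lp_distance(f, g, p):
    """Unnormalized L_p distance between two equal-length real-valued sequences.

    Illustrative only: the function name and the lack of normalization are
    assumptions, not taken from the talk.
    """
    if len(f) != len(g):
        raise ValueError("inputs must have the same length")
    diffs = [abs(a - b) for a, b in zip(f, g)]
    if p == 0:
        # L_0 (Hamming): number of positions where the two sequences differ.
        return float(sum(1 for d in diffs if d != 0))
    if p < 1:
        raise ValueError("p must be 0 or at least 1")
    # L_p for p >= 1: p-th root of the sum of p-th powers of the differences.
    return sum(d ** p for d in diffs) ** (1.0 / p)
```

With p = 2 this is the Euclidean distance, and with p = 0 it counts the positions on which the two datasets disagree, which is the measure used by standard property testers.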

We use our framework to design simple and fast algorithms for classic problems, such as testing monotonicity, convexity and the Lipschitz property, and also approximating the distance to monotonicity. For some of these problems, our L_p-testers for p>=1 are faster than possible in the standard property testing framework.

In the talk, we will explain and motivate the new framework, give an overview of the results, and explain some of the techniques.

Joint work with Piotr Berman and Grigory Yaroslavtsev, STOC 2014.

Private Empirical Risk Minimization: Efficient Algorithms and Tight Error Bounds

The volume of high-quality data that we can access, analyze, and learn from has reached astronomical figures. Since these data often contain sensitive or personally identifiable information, privacy has become a major concern. Because it is hard to reason intuitively about privacy, a rigorous mathematical approach is needed. The notion of differential privacy [Dwork-McSherry-Nissim-Smith, 2006] is one of the most powerful and widely accepted mathematical definitions of privacy. Roughly speaking, a (randomized) algorithm that takes a data set as input is said to be differentially private if, for any pair of data sets D and D' that differ in only one record, the distribution of the algorithm's output remains "almost" the same whether it is run on D or D'.
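As a small illustration of this definition, the Python sketch below uses randomized response on a single private bit. This is a standard textbook mechanism and is not one of the algorithms from the talk; the function names are hypothetical. The two neighboring data sets are the one-record data sets holding bit 0 and bit 1. For pure epsilon-differential privacy, the probability of every output may change by a factor of at most e^epsilon between them.

```python
import math

def randomized_response_distribution(bit, epsilon):
    """Output distribution {output: probability} of randomized response.

    The true bit is reported with probability e^eps / (1 + e^eps) and flipped
    otherwise. Textbook mechanism, used here purely as an illustration.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return {bit: p_truth, 1 - bit: 1.0 - p_truth}

def max_probability_ratio(epsilon):
    """Largest ratio between output probabilities on the two neighboring inputs."""
    dist0 = randomized_response_distribution(0, epsilon)
    dist1 = randomized_response_distribution(1, epsilon)
    return max(
        max(dist0[out] / dist1[out], dist1[out] / dist0[out])
        for out in (0, 1)
    )
```

For this mechanism the largest ratio equals e^epsilon, so the output distribution is "almost" the same on the two inputs in exactly the sense of the definition.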

In this talk, we consider the problem of Empirical Risk Minimization (ERM) under the differential privacy constraint. The notion of ERM is at the heart of learning theory and statistics, and is a central tool in machine learning algorithms. In the traditional convex ERM, we are given a data set {d_1, ..., d_n}, together with a convex set C known as the parameter set. For a given parameter w in C, each data point d_i contributes a loss L(w; d_i). The algorithm's goal is to find a parameter w in C that minimizes the empirical risk, defined as \sum_{i=1}^n L(w; d_i).
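The ERM objective can be sketched in a few lines of Python. This example is non-private and does not reproduce any algorithm from the talk. The absolute loss, the parameter set C = [-1, 1], the grid search, and all function names are assumptions chosen for illustration.

```python
def empirical_risk(w, data, loss):
    """Empirical risk of parameter w: the sum of the losses L(w; d_i)."""
    return sum(loss(w, d) for d in data)

def absolute_loss(w, d):
    """A convex, 1-Lipschitz loss (hypothetical choice for this sketch)."""
    return abs(w - d)

def minimize_over_grid(data, loss, lo=-1.0, hi=1.0, steps=201):
    """Non-private brute-force ERM over a grid on the interval C = [lo, hi]."""
    candidates = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(candidates, key=lambda w: empirical_risk(w, data, loss))
```

For example, `minimize_over_grid([0.0, 0.5, 1.0], absolute_loss)` returns the median of the data, which minimizes the sum of absolute losses. A differentially private algorithm must add randomness to this computation, and the excess risk measures how much empirical risk is lost as a result.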

We show how to construct efficient differentially private algorithms with optimal excess risk (i.e., the least amount of error necessary to satisfy the differential privacy constraint). Namely, we provide efficient algorithms and matching lower bounds for (i) the "general" setting where we only assume the loss function is convex and Lipschitz-continuous and the convex constraint set is bounded; and (ii) the setting where the loss function is also known to be strongly convex. We give separate algorithms (and lower bounds) for the two standard cases of differential privacy: pure $\epsilon$-differential privacy and $(\epsilon, \delta)$-differential privacy. Perhaps surprisingly, the techniques used for designing optimal algorithms in the two cases are completely different.

Based on joint work with Adam Smith and Abhradeep Thakurta; to appear in FOCS 2014.

Optimal Property Testers Over Product Distributions

We study algorithms that, given access to a small number of samples from a large dataset, approximately determine whether the dataset satisfies a desired property. More specifically, our algorithms accept if the dataset has the property and reject with high probability if the dataset is far from having the property. The distance to having the property is measured with respect to a known or an unknown distribution on the datasets. Since any dataset can be represented as a function, we study testing properties of functions. In this work, we focus on functions over domains of the form {1,2,...,n}^d (that is, d-dimensional hypergrids). We look at a general class of properties of such functions, called bounded derivative properties (BDPs), which includes monotonicity and the Lipschitz property. These two properties are fundamental and have applications in processing database queries and in data privacy, respectively.

We give an optimal tester for BDPs for the case when the distance to the property is measured with respect to a product distribution, that is, a distribution where each coordinate is chosen independently. Our main tool here is a novel dimension reduction that reduces testing properties of functions over {1,2,...,n}^d to testing functions over {1,2,...,n}. This dimension reduction is optimal up to a constant factor. For BDPs of functions over {1,2,...,n}, we design an optimal tester for the case when the distribution is known. Our tester is based on Knuth's construction of binary search trees over {1,2,...,n} with minimum expected depth. As a special case, we obtain an optimal monotonicity tester for {1,2,...,n}, thus improving the tester given by Ailon and Chazelle (Information and Computation, 2006). Our work resolves two open problems posed in their work.

Joint work with Deeparnab Chakrabarty, Madhav Jha, and C. Seshadhri (to appear in SODA 2015).

Pairwise spanners with additive distortion

An additive pairwise spanner of an undirected graph is a subgraph in which the shortest distances between certain specified pairs of nodes are larger than the corresponding distances in the original graph by at most an additive term. This additive term is called the distortion of that spanner. In the case when the approximation requirement applies to all pairs of nodes in the graph, the subgraph is simply called an additive spanner.
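The definition can be checked directly on small unweighted graphs. The following Python sketch is a verifier for the definition only, not one of the construction algorithms from the talk. The function names and the edge-list representation are assumptions, and the candidate subgraph is assumed to use only edges of the original graph.

```python
from collections import deque

def build_adjacency(edges):
    """Adjacency sets of an undirected, unweighted graph given as an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def is_additive_pairwise_spanner(graph_edges, sub_edges, pairs, beta):
    """Check that for each pair (s, t), dist_H(s, t) <= dist_G(s, t) + beta."""
    g = build_adjacency(graph_edges)
    h = build_adjacency(sub_edges)
    for s, t in pairs:
        dist_g = bfs_distances(g, s).get(t)
        if dist_g is None:
            continue  # s and t are disconnected in G, so there is nothing to preserve
        dist_h = bfs_distances(h, s).get(t)
        if dist_h is None or dist_h > dist_g + beta:
            return False
    return True
```

For instance, dropping the edge (4, 1) from the 4-cycle on nodes 1 to 4 stretches the distance between nodes 1 and 4 from 1 to 3, so the resulting path is a pairwise spanner for the pair (1, 4) with distortion 2 but not with distortion 1.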

Additive spanners are very well studied. They have applications in compact routing schemes, near-shortest-path algorithms, approximate distance oracles, etc. It is NP-hard to compute the sparsest additive pairwise spanner of a graph. The typical goal is to design algorithms that, for specific values of distortion, construct pairwise spanners that are as sparse as possible.

In the talk, we will present a brief survey of the work done in the area of graph spanners, the important algorithmic techniques used in constructing them, and high-level ideas of some of our algorithms for additive pairwise spanners.

Joint work with T. Kavitha.

A Near-Optimal Algorithm for Testing Graph Isomorphism

The goal of the graph isomorphism problem is to check whether two finite graphs are identical when their labels are ignored. Apart from practical applications such as chemical compound identification and circuit verification, the problem is interesting due to its challenging open computational status.

In the property testing framework, Fischer and Matsliah (SICOMP 2008) showed that the number of queries necessary to (approximately) test isomorphism of two unknown graphs in the dense graph model is at least Omega(n) and at most O(n^(5/4)). We essentially close this gap by showing an algorithm that makes n * 2^O(sqrt(log n)) queries.

Distribution testing (Batu et al. [JACM 2013]; Batu et al. [FOCS 2001]) is one of the main tools in this line of research. A major obstacle in the quest for an efficient graph isomorphism tester turns out to be the Omega(n^(2/3)) distribution testing lower bound due to Valiant (SICOMP 2011). We bypass this lower bound by designing a more efficient algorithm that for two distributions on near-isomorphic sets of points is required to reject only if the Earth-Mover Distance between them is large.

Joint work with Xiaorui Sun (Columbia University).

Sample-Based Constant-Time Algorithms for Properties of Images

We initiate a systematic study of sublinear-time algorithms for image analysis that have access only to labeled random samples from the input. Most previous sublinear-time algorithms for image analysis were query-based, that is, they could query pixels of their choice. We consider algorithms with two types of input access: sample-based algorithms that draw independently random pixels, and block-sample-based algorithms that draw pixels from independently random square blocks of the image. We investigate three basic properties of black-and-white images: being a half-plane, convexity and connectedness. For the first two properties, all our algorithms are sample-based, and for connectedness they are block-sample-based. All algorithms we present have low sample complexity that depends only on the error parameter, but not on the input size.

We design algorithms that approximate the distance to the three properties within a small additive error or, equivalently, tolerant testers for being a half-plane, convexity and connectedness. Tolerant testers for these properties, even with query access to the image, were not investigated previously. Tolerance is important in image processing applications because it allows algorithms to be robust to noise in the image. We also give (non-tolerant) testers for these properties with better complexity than implied by our distance approximation algorithms. For convexity and connectedness, our testers are faster than previously known query-based testers.

To obtain our algorithms for convexity, we design two fast proper PAC learners of convex sets in two dimensions that work under the uniform distribution: non-agnostic and agnostic.

Joint work with Piotr Berman and Sofya Raskhodnikova.

On the use of regularization in deep neural networks

A deep neural network (DNN) is a classification algorithm described by many layers of threshold gates arranged in sequence. DNNs have recently become hugely popular, both in industry and among academics, mostly due to their stellar success in computer vision and speech processing. Unfortunately, training these DNNs for a new learning application is notoriously hard, partly because of the nonconvex optimization problems that arise. In contrast, the optimization problems that arise in training more traditional classification models are convex, and hence relatively easy to solve.

Dropout regularization (Hinton et al., 2012) is a heuristic designed to help in the training of DNNs. Dropout significantly reduces the classification error of DNNs on many benchmark data sets, and is now widely used in practice. However, despite recent efforts (Baldi and Sadowski, 2013; Wager et al., 2013), dropout's theoretical properties remain poorly understood. Until this work, there were no rigorous guarantees on its effect on classification error.

We provide the first proof that the generalization error of a neural network trained using dropout regularization goes to 0 as the number of training examples tends to infinity. Our proof applies to a one-layer neural network. Moreover, we provide strong stability (robustness) guarantees for dropout and relate them to differential privacy, a definition of privacy for statistical databases. Finally, we provide experimental evidence supporting our theoretical claims.

In this talk, I will introduce deep neural networks and the dropout heuristic in its general form. I will explain our results on the error and the robustness of dropout in the context of one-layer neural networks, and relate them to differential privacy. No prior knowledge of neural networks or differential privacy is assumed.

Joint work with Prateek Jain [Microsoft Research India], Vivek Kulkarni [SUNY, Stony Brook], and Oliver Williams.

RSA Key Extraction via Low-Bandwidth Acoustic Cryptanalysis

RSA is one of the most popular asymmetric cryptographic schemes and is considered very secure given long keys (e.g., 2048 bits). However, Daniel Genkin, Adi Shamir, and Eran Tromer found that, due to vibration in some of a computer's electronic components, a high-pitched noise is emitted during RSA decryption and signing. In this talk, we will discuss how to leverage these acoustic emanations to "hear" sensitive information about security-related computations.

In the paper under discussion, a new acoustic cryptanalysis key extraction method based on a chosen-ciphertext attack is designed and implemented against GnuPG's current implementation of RSA. The attack can extract full 4096-bit RSA decryption keys from laptop computers by sending email to the Enigmail software, which uses GnuPG RSA for decrypting and signing email content. The authors' experimental results show that the attack can be carried out using either a normal mobile phone placed next to the computer or a more sensitive microphone placed 4 meters away.

As background, we will also explain the RSA decryption/signature procedure and its optimization based on the Chinese Remainder Theorem.
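For readers unfamiliar with the optimization, here is a minimal Python sketch of textbook RSA decryption using the Chinese Remainder Theorem (Garner's recombination). It is not GnuPG's implementation: the function name is hypothetical, there is no padding or side-channel protection, and the modular inverse via `pow(q, -1, p)` assumes Python 3.8 or later.

```python
def rsa_crt_decrypt(c, p, q, d):
    """Textbook RSA decryption of ciphertext c with the CRT speed-up.

    p and q are the secret primes and d is the private exponent.
    Illustrative sketch only, not a secure implementation.
    """
    # Reduced exponents: one small exponentiation per prime instead of one
    # large exponentiation modulo n = p * q.
    d_p = d % (p - 1)
    d_q = d % (q - 1)
    q_inv = pow(q, -1, p)          # q^(-1) mod p (requires Python 3.8+)

    m_p = pow(c, d_p, p)           # the message modulo p
    m_q = pow(c, d_q, q)           # the message modulo q

    # Garner's formula recombines the two residues into the message modulo n.
    h = (q_inv * (m_p - m_q)) % p
    return m_q + h * q
```

With the toy parameters p = 61, q = 53, e = 17, and d = 2753, encrypting a message m as `pow(m, 17, 61 * 53)` and passing the result to `rsa_crt_decrypt` recovers m and agrees with the direct computation `pow(c, d, n)`. The two half-size exponentiations are what make this approach faster in practice.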