# Theoretical Computer Science Group

## Theory Seminar Schedule for Fall 2012

The weekly Theory seminar provides an opportunity to get exposed to cutting edge research in theoretical computer science, to learn to understand technical material from talks and to give technical presentations. Graduate students interested in doing research in algorithms, complexity theory, cryptography, quantum computing and related areas are strongly encouraged to attend.

Date Speaker Title
Aug 27 Madhav Jha Counting Triangles in Data Stream
Sep 10 Or Sheffet The Johnson-Lindenstrauss Transform Itself Preserves Differential Privacy
Sep 17 Kashyap Dixit Testing Lipschitz Property over Distributions and its Applications to Statistical Data Privacy
Sep 24 Adam Smith He Said that She said that He said... : Open Problems and a Few Results on Robust Secret Sharing
Oct 4 Yevgeniy Dodis Overcoming Weak Expectations
Oct 9 Fang Song Quantum Random Oracles in Cryptography
Oct 16 Paul Medvedev The Plane Width of Graphs
Nov 5 Grigory Yaroslavtsev Learning and Testing Submodular Functions
Nov 12 Sandeep Narayanaswami Local Hidden Variable Theories, Bell Inequalities, and Secure Quantum Key Distribution
Dec 3 Megan Heysham Starlike trees are determined by their Laplacian spectrum
Dec 10 Raef Bassily Coding for Adversarial Causal Channels
Dec 14 Zhenyao Li The Rectilinear Steiner Arborescence Problem
• All talks will be at 3:35 pm on Mondays in 101 Leonhard (unless they are part of the departmental colloquium).
• If you would like to speak, please email us at: theory [at] cse [dot] psu [dot] edu.
• If you want to receive announcements of upcoming talks, please join the theory mailing list.
Archives

## Counting Triangles in Data Stream

We study the problem of counting the number of triangles in a graph in the streaming setting. Specifically, given an input graph as a stream of edges (in an arbitrary order), we want to accurately estimate the total number of triangles after processing the entire stream of edges exactly *once*. We would like to do this while storing only a small fraction of the stream. The problem of counting triangles in large graphs is a classic problem with numerous applications.

We give a 1-pass streaming algorithm which stores a small fraction of the edges and outputs an estimate to the number of triangles that is accurate up to a small additive error. The additive error can be arbitrarily decreased at the cost of more storage. This is a significant improvement from all previous work that provide answers with fairly larger error (using more space). We combine probabilistic ideas from the streaming and sublinear algorithms literature to design our final algorithm.

This is a work in progress with Sesh Comandur pursued during internship at Sandia National Labs. We have worked out most of the theory behind our algorithm and are currently validating our results on real networks. Initial results seem quite promising: our algorithm provides answers with < 5% additive error by storing only 1% of edges of the graph.

## The Johnson-Lindenstrauss Transform Itself Preserves Differential Privacy

We show that an old dog'', namely -- the classical Johnson-Lindenstrauss transform, performs new tricks'' -- it gives a novel way of preserving differential privacy. We show that if we take two databases, D and D', such that (i) D'-D is a rank-1 matrix of bounded norm and (ii) all singular values of D and D' are sufficiently large, then multiplying either D or D' with a vector of iid normal Gaussians yields two statistically close distributions in the sense of differential privacy. Furthermore, a small, deterministic and public alteration of the input is enough to assert that all singular values of D are large.
We apply the Johnson-Lindenstrauss transform to the task of approximating cut-queries: the number of edges crossing a (S,\bar S)-cut in a graph. We show that the JL transform allows us topublish a sanitized graph that preserves edge differential privacy (where two graphs are neighbors if they differ on a single edge) while adding only O(|S|/\epsilon) random noise to any given query (w.h.p). Comparing the additive noise of our algorithm to existing algorithms for answering cut-queries in a differentially private manner, we outperform all others on small cuts (|S| = o(n)).
We also apply our technique to the task of estimating the variance of a given matrix in any given direction. The JL transform allows us to publish a sanitized covariance matrix that preserves differential privacy w.r.t bounded changes (each row in the matrix can change by at most a norm-1 vector) while adding random noise of magnitude independent of the size of the matrix (w.h.p). In contrast, existing algorithms introduce an error which depends on the matrix dimensions.

## Testing Lipschitz Property over Distributions and its Applications to Statistical Data Privacy

In this work, we present a connection between Lipschitz property testing and a relaxed notion of differential privacy, where we assume that the datasets are being sampled from a domain according to some distribution defined on it. Specifically, we show that testing whether an algorithm is private can be reduced to testing Lipschitz property in the distributional setting.

We also initiate the study of *distribution* Lipschitz testing. We present an efficient Lipschitz tester for the hypercube domain when the "distance to property" is measured with respect to product distribution. Most previous works in property testing of functions (including prior works on Lipschitz testing) work with uniform distribution.

## He Said that She said that He said... : Open Problems and a Few Results on Robust Secret Sharing

We'll discuss the problem of "robust" secret sharing. As in normal secret sharing, a dealer distributes a secret to a group of n people ("players") in way that no set of t players can learn anything about the secret. Normally, the only completeness requirement is that any t+1 players can reconstruct the secret from their shares. In robust secret sharing, there is also a robustness requirement: it should be possible to reconstruct the secret given all n shares even when n-t of those shares have been modified.

This is a classic problem related to multi-party computation and "perfectly secure message transmission". We'll discuss the state of the art as well as some open questions and a conjectured (partial) solution to one of the questions.

## Overcoming Weak Expectations

Recently, there has been renewed interest in basing cryptographic primitives on weak secrets, where the only information about the secret is some non-trivial amount of (min-) entropy. The security proofs of these primitives require an upper bound the expectation of some function f(X), where X is a weak source in question. We show an elementary inequality which upper bounds such ‘weak expectation’ by two terms, the ?rst of which is independent of f, while the second only depends on the ‘variance’ of f under uniform distribution. As relatively simple corollaries of this elementary inequality, we obtain some ‘unexpected’ results, in several cases noticeably simplifying/improving prior techniques for the same problem. Examples include non-malleable extractors, leakage-resilient symmetric encryption, alternative to the dense model theorem, seed-dependent condensers and improved entropy loss for the leftover hash lemma.

## Quantum Random Oracles in Cryptography

Random oracle (RO) model is widely used in classical cryptography to design efficient crypto-schemes, e.g., encryption and signature. However, when we consider a quantum version of random oracle (QRO), where players have the potential power of querying the oracle in superposition, security of existing schemes becomes unclear, and many proof techniques need re-examining. In this talk, I will introduce the RO/QRO model, and give a concrete example, called Full Domain Hash (FDH), that constructs secure signature schemes from trapdoor permutations in RO model. I will use this example to demonstrate proof techniques in (classical) RO model and the difficulties of extending such techniques in QRO model. Then, by developing a new indistinguishability result of distributions against quantum computers, I will show how to get around these difficulties and prove security of FDH in QRO model. Finally, I will discuss a few open questions regarding which classical RO results can/cannot hold in QRO model.

## The Plane Width of Graphs

Map the vertices of a graph to (not necessarily distinct) points of the plane so that two adjacent vertices are mapped at least unit distance apart. The plane-width of a graph is the minimum diameter of the image of its vertex set over all such mappings. We establish a relation between the plane-width of a graph and its chromatic number. We also connect it to other well-known areas, including the circular chromatic number and the problem of packing unit discs in the plane.

## Learning and Testing Submodular Functions

Submodular functions capture the law of diminishing returns and can be viewed as a generalization of convexity to functions over the Boolean cube. Such functions arise in different areas, such as combinatorial optimization, machine learning and economics. In this talk we will focus on positive results about learning such functions from examples and testing whether a given function is submodular with a small number of queries. For the class of submodular functions taking values in discrete integral range of size R we show a structural result, giving concise representation for this class. The representation can be described as a maximum over a collection of threshold functions, each expressed by an R-DNF formula. This leads to efficient PAC-learning algorithms for this class, as well as testing algorithms with running time independent of the size of the domain.

## Local Hidden Variable Theories, Bell Inequalities, and Secure Quantum Key Distribution

Local hidden variable theories were introduced as an attempt to circumvent the non-deterministic conclusions of quantum mechanics. However, these have been shown to be incompatible with measurement statistics on quantum systems. Bell's inequalities give us a way of bounding correlations among systems in local hidden variable theories; we look at how these inequalities are violated by quantum systems. Violations of Bell's inequalities are also a resource: we discuss a key distribution protocol from the paper No Signalling and Quantum Key Distribution'' (J. Barrett, L. Hardy, and A. Kent, 2005) that is secure under the sole assumption of no instantaneous communication; the proof of security depends on violations of a Bell inequality.

## Starlike trees are determined by their Laplacian spectrum

The speaker will present a result of G.R. Omidi and K. Tajbakhsh, namely that starlike trees are determined by their spectrum. One goal of spectral graph theory is to classify graphs according to the spectrum of their Laplacian matrix, and understand which graphs are determined uniquely (up to isomorphism) by their spectrum. Recall that the Laplacian L(G) is the matrix deg(G)-A(G), where deg(G) is a diagonal matrix whose entries are the degrees of vertices in G, and A(G) is the adjacency matrix. The Laplacian spectrum of G is the multiset of eigenvalues of L(G). A graph is said to be determined by its Laplacian spectrum if there is no other non-isomorphic graph with the same Laplacian spectrum. Currently there are few families of graphs that are determined by the Laplacian spectrum. A tree is called starlike if it has exactly one vertex of degree greater than two (it thus consists of a set of chains attached to a common central vertex). The family of starlike trees is determined by the Laplacian spectrum. That is, if a starlike tree G and an arbitrary graph G' have the same spectrum, then G' is isomorphic to G.

## Coding for Adversarial Causal Channels

In this talk, we consider the communication of information in the presence of a causal adversarial jammer. In the setting under study, a sender wishes to communicate a message to a receiver by transmitting a codeword x = (x1, ... ,xn) bit-by-bit over a communication channel. The sender and receiver do not share common randomness. The adversarial jammer can view the transmitted bits x_i one at a time, and can change up to a p fraction of them. However, the decisions of the jammer must be made in a causal manner. Namely, for each bit x_i the jammer's decision on whether to corrupt it or not (and on how to change it) must depend only on x_j for j=i. This is in contrast to the classical adversarial jamming situations in which the jammer has no knowledge of x or complete knowledge of x. In this talk, upper and lower bounds on the capacity of such communication channels will be presented. The upper bound was derived in [B.K. Dey - S. Jaggi - M. Langberg - A.D. Sarwate, ISIT2012], whereas the lower bound, which beats the well known Gilbert-Varshamov bound, was obtained in [I. Haviv - M. Langberg, ISIT2011].

## Coding for Adversarial Causal Channels

In this talk, we consider the communication of information in the presence of a causal adversarial jammer. In the setting under study, a sender wishes to communicate a message to a receiver by transmitting a codeword x = (x1, ... ,xn) bit-by-bit over a communication channel. The sender and receiver do not share common randomness. The adversarial jammer can view the transmitted bits x_i one at a time, and can change up to a p fraction of them. However, the decisions of the jammer must be made in a causal manner. Namely, for each bit x_i the jammer's decision on whether to corrupt it or not (and on how to change it) must depend only on x_j for j=i. This is in contrast to the classical adversarial jamming situations in which the jammer has no knowledge of x or complete knowledge of x. In this talk, upper and lower bounds on the capacity of such communication channels will be presented. The upper bound was derived in [B.K. Dey - S. Jaggi - M. Langberg - A.D. Sarwate, ISIT2012], whereas the lower bound, which beats the well known Gilbert-Varshamov bound, was obtained in [I. Haviv - M. Langberg, ISIT2011].

## The Rectilinear Steiner Arborescence Problem

Given a set N of n nodes lying in the first quadrant of the real plane, the goal of the Rectilinear Steiner Arborescence problem (RSA) is to find a minimum length directed tree rooted at the origin and containing all nodes in N. The tree should be "rectilinear", that is, composed solely of horizontal arcs oriented from left to right and vertical arcs oriented from bottom to top. Rao et al (1992) introduced the problem and gave a polynomial-time 2-approximation algorithm. Many subsequent approaches for improving the algorithm were proposed. We will discuss one branch and bound algorithm that runs in exponential time but outputs an exact solution. Another approximation approach restricts the branching to make the algorithm polynomial time. Experiments show that it performs well in practice. The talk will cover these basic algorithms and, to the extent possible, their analysis.