The weekly Theory seminar provides an opportunity to get exposed to
cutting edge research in theoretical computer science, to learn to
understand technical material from talks and to give technical
presentations. Graduate students interested in doing research in
algorithms, complexity theory, cryptography, quantum
computing and related areas are strongly encouraged to attend.
All talks will be at 2:30 pm on Mondays in 118 Earth and Engineering Science Building (unless they
are part of the departmental colloquium). Any unusual date/time/location is highlighted in the schedule.
If you would like to speak,
please email us at: theory [at] cse [dot] psu [dot] edu.
If you want to receive announcements of upcoming talks, please
join the theory
mailing list.
Analyzing Graphs with Node Differential Privacy
We develop algorithms for the private analysis of network data that
provide accurate analysis of realistic networks while satisfying stronger
privacy
guarantees than those of previous work. We present several techniques for
designing
node differentially private algorithms, that is, algorithms whose output
distribution does not change significantly when a node and all its adjacent
edges
are added to a graph. We also develop methodology for analyzing the accuracy
of such algorithms on realistic networks.
The main idea behind our techniques is to “project” (in one of several
senses)
the input graph onto the set of graphs with maximum degree below a certain
threshold. We design projection operators, tailored to specific statistics
that have
low sensitivity and preserve information about the original statistic.
These operators
can be viewed as giving a fractional (low-degree) graph that is a solution
to
an optimization problem described as a maximum flow instance, linear
program,
or convex program. In addition, we derive a generic, efficient reduction
that allows
us to apply any differentially private algorithm for bounded-degree graphs
to an arbitrary graph. This reduction is based on analyzing the smooth
sensitivity
of the “naive” truncation that simply discards nodes of high degree.
Beating the Direct Sum Theorem in Communication Complexity
A two-party communication problem is given by a function f(x,y), where x and y are held by two different parties (Alice and Bob) connected via a communication channel. The goal of the parties is to compute f(x,y) while minimizing the communication through the channel on the worst-case pair of inputs (x,y). Randomized communication complexity of f with error probability \eps, denoted R_{\eps}(f), is the minimum communication required for a randomized protocol which computes f with error probability at most \eps. Results in the area of communication complexity are used to prove strong information-theoretic lower bounds in different areas of computer science, ranging from data structures to circuit complexity.
A direct sum theorem for two parties and a function f states that the communication cost of solving k instances of f simultaneously (computing a vector f(x_1,y_1), ... f(x_k, y_k)) with error probability 1/3 is at least k * R_{1/3}(f), where R_{1/3}(f) is the communication required to solve a single instance of f with error probability 1/3. We improve this for a specific family of functions f, showing that the communication required to solve k instances of f simultaneously with error probability 1/3 is (k R_{1/k}(f)). Since R_{1/k}(f) may be as large as R_{1/3}(f) * log k, we asymptotically beat the standard direct sum bound for such functions. This shows that the trivial upper bound which solves each of the k instances of f with probability 1 - O(1/k) is optimal (correctness follows from the union bound).
Our results imply optimal communication/space lower bounds for several sketching problems (e.g. Johnson-Lindenstrauss transform) in a setting when the algorithm should be correct on a sequence of k queries.
Joint work with Marco Molinaro and David Woodruff.
The Regularity of Lossy RSA on Subdomains and its Applications
Rivest, Shamir and Adleman's conjecture, that exponentiation modulo
composite numbers is hard to invert, is the basis of many modern
encryption and signature schemes. However, the security of most of the
schemes resides on (apparently) stronger assumptions than simply the
difficulty of inverting exponentiation. For example, it is often
assumed that certain specific bits of the input are hard to compute
from the output.
In this paper, we investigate the relationship between several of
these stronger variants of the RSA assumption. Under the "phi-hiding
assumption" (which we define in the talk), we show that large
consecutive runs of input bits are hard to distinguish from random
given the output, and that "PKCS #1 v1.5" and "RSA-OAEP" (two
particular encryption schemes based on RSA) are secure under chosen
plaintext attacks.
We obtain these results by deriving new bounds on how evenly the
primitive e-th roots of unity mod N are distributed, for exponents e
that divide phi(N). The new bounds use both elementary techniques and
recent estimates of Gauss sums over finite subgroups. Our results
deepen the connection between ``combinatorial'' properties of
exponentiation in Z_N and the security of RSA-based constructions.
The talk will not assume any special background except familiarity
with modular arithmetic and basic group theory.
Joint work with Eike Kiltz, Mark Lewko and Adam O'Neill. Based on a
papers at Crypto 2010 and Eurocrypt 2013.
Fairness with Rational Adversaries
Two-party (or multi-party) computation allows users who each hold a
different part of the input to compute some joint function without
sharing their inputs with each other. Traditional security notions
have focused on guaranteeing that the output is correct and the inputs
are kept secret even if a party behaves maliciously. Fairness, the
requirement that no party can abort after learning the output and
prevent others from also learning the output, is impossible to achieve
in general. In this work, we model the interaction as a game, and
model the players as rational utility-maximizing agents. We find that
in this setting fairness can be achieved in as broad a set of
situations as one could hope for. This gives a new situation in which
a game-theoretic setting can be used to circumvent known impossibility
results.
Joint work with Jonathan Katz.
The Method of Multiplicities
Polynomials have played a fundamental role in the
construction of objects with interesting combinatorial properties,
such as error correcting codes, pseudorandom generators, randomness
extractors etc. Somewhat strikingly, polynomials have also been found
to be a powerful tool in the analysis of combinatorial parameters of
objects that have some algebraic structure. This method of analysis
has found applications in works on list-decoding of error correcting
codes, constructions of randomness extractors, and in obtaining
strong bounds for the size of Kakeya Sets. Remarkably, all these
applications have relied on very simple and elementary properties of
polynomials such as the sparsity of the zero sets of low degree
polynomials.
In this talk we will discuss improvements on several of
the results mentioned above by a more powerful application of
polynomials that takes into account the information contained in the
*derivatives* of the polynomials. We call this technique the
``method of multiplicities". The information about higher
multiplicity vanishings of polynomials, which is encoded in the
derivative polynomials, enables us to meaningfully reason about the
zero sets of polynomials of degree much higher than the underlying
field size. We will discuss some of these applications of the method
of multiplicities, to obtain improved constructions of error
correcting codes, and qualitatively tighter analyses of Kakeya sets,
and randomness extractors.
(Based on joint works with Zeev Dvir,
Swastik Kopparty, Madhu Sudan, Sergey Yekhanin)
Graph Sketching and Homomorphic Compression
Applying random linear projections, a.k.a. "sketching", has become a standard technique when analyzing high-dimensional data sets. The resulting algorithms are embarrassingly parallelizable and suitable for stream processing. However, existing results have largely focused on vector-based problems (e.g., estimating norms and reconstructing sparse signals) and linear-algebraic problems (e.g., regression and low-rank approximation). In this work, we ask whether the richer combinatorial structure of graphs can also be analyzed via small sketches. We present a range of algorithmic results and applications including the first single-pass algorithm for constructing graph sparsifiers of fully-dynamic graph streams (i.e., where edges can be added and deleted in the underlying graph).
Based on joint work with Kook Jin Ahn and Sudipto Guha (SODA 12, PODS 12).
The Discrepancy Method in Communication Complexity
The most commonly studied model in communication complexity theory is the following. There are two players, Alice and Bob. Alice holds input x, Bob holds input y, and neither knows the other's input. They want to compute some joint function f(x,y). In order to do that, they exchange information (bit by bit) according to some protocol chosen beforehand. Their goal is to minimize the communication complexity (that is, the number of bits communicated between them). Lower bounds on communication complexity are used in various fields of computer science, e.g., VLSI circuit design, data structures, optimization of computer networks and property testing. In this talk, we present the discrepancy method that can be used to prove lower bounds on randomized communication complexity, that is,
communication complexity over randomized protocols.
The material for this talk is taken from the book "Communication Complexity" by Eyal Kushilevitz and Noam Nisan.
Best known approximation algorithm for the s-t Path Traveling Salesman Problem
This talk is based on the paper by C. H. An, R. Kleinberg and D. Shmoys called "Improving Christofides algorithm for the s-t path TSP" from STOC 2012. It gives the best known approximation algorithm for a classical NP-complete problem: the s-t Path Traveling Salesman Problem (TSP) over an arbitrary metric. The best approximation algorithm for this problem prior to the work of An, Kleinberg and Shmoys was presented by Hoogeveen in 1991. It was based on Christofides' algorithm (for Circuit TSP) and had approximation ratio 5/3. The approach in [AKS12] uses a linear programming relaxation of the problem.
I will start by explaining Christofides' algorithm. Then I will state the algorithm from [AKS12]. I will present the analysis that gives an approximation ratio of 5/3 for the new algorithm. Then I will give a short overview of the improved analysis that achieves a 1.6577 approximation ratio for this algorithm. (In the paper, this analysis is further improved to get the best known approximation ratio of (1+sqrt(5))/2.)
Load Bounds for Two-Choice Hashing
When placing n balls in n bins sequentially, if each ball selects a bin uniformly at random, the resulting maximum load of any bin is Theta(ln n/ln ln n) with high probability. If each ball instead selects two bins and is placed in the least full bin, we can achieve better load-balancing. In this talk, I will prove that if each ball selects d >=2 bins, the resulting maximum load of any bin is Theta(ln ln n). This talk is based on a proof from the textbook "Probability and Computing" by Michael Mitzenmacher and Eli Upfal.
Testing monotonicity of Boolean functions
In this talk, we present the exciting new progress made by Chakrabarty and Seshadhri (STOC 2013) on the problem of Boolean monotonicity testing. This is a classic problem in the study of property testing of functions and has been studied extensively since 1998. In this work, the authors give an O(n^(5/6)) time algorithm for distinguishing Boolean monotone functions from functions which are "far" from monotone. In particular, this upcoming STOC 2013 paper establishes (the long conjectured result) that complexity of Boolean monotonicity testing is sublinear in dimension n.
Boolean monotonicity testing is a generalization of the problem of determining if an array is sorted to higher dimensions. More precisely, a function f : {0,1}^n - > {0,1} is monotone if for every pair of inputs x = (x_1, ..., x_n) and y = (y_1, ..., y_n) satisfying x_i <= y_i for all i, the following holds: f(x) <= f(y). The goal is to design a randomized algorithm which gets oracle access to f and has the following behavior. It accepts monotone functions and rejects (with probability at least 2/3) functions which are epsilon-far from monotone. A function is epsilon-far from monotone if it must be modified in epsilon-fraction of places to make it monotone.
A Subexponential Algorithm to Solve Discrete Logarithm Problem
Let G be a cyclic group generated by element g. The discrete logarithm
problem (DLP) is: given g and an element t in G, find an integer s
such that g^s=t. DLP is thought to be a hard problem in many groups,
and has many applications in cryptography (encryption and signature
schemes, zero-knowledge proof systems, and others). Assuming G has
size N, the brute force algorithm to search for DLP takes O(N) time.
We can improve the running time to O(\sqrt{N}) without any restriction
on G. However, both of these algorithms take time exponential in
log(N). In this talk, I will present a randomized algorithm to solve
DLP over Z*_p (for prime p) that runs in time exp(\sqrt{log p log log
p}), which is subexponential in N=log p. The content of this talk is
based on Chapter 16.2 of Shoup's textbook, "A Computational
Introduction to Number Theory and Algebra".
The P vs NP Question: Past, Present, Future
In this talk I will give some history of the great P vs NP question. I will also discuss the problem's current status, some current claims about it, and talk about its future. The talk should be fun for those who are not experts, and also fun for those who are experts (though with the P=NP question, it is unclear who is really an "expert", since we know so little).
Finally, I will outline an approach to the question that I have been actively pursuing for several years now. The end is not in sight, but progress appears possible.