# Theoretical Computer Science Group

## Theory Seminar Schedule for Fall 2009

The weekly Theory seminar provides an opportunity to get exposed to cutting edge research in theoretical computer science, to learn to understand technical material from talks and to give technical presentations. Graduate students interested in doing research in algorithms, complexity theory, cryptography, quantum computing and related areas are strongly encouraged to attend.

Date Speaker Title
Aug 24, 2009 Orientation
Aug 31, 2009 Prasenjit Mitra Query Containment and Rewriting Using Views
Sep 14, 2009 Steve Simpson Algorithmic randomness and Kolmogorov complexity
Sep 21, 2009 Krzysztof Onak Sublinear Graph Approximation Algorithms
Sep 28, 2009 Youngtae Youn A Constructive Proof of Lovasz Local Lemma
Oct 5, 2009 Vinod Vaikuntanathan Leakage-resilient cryptography
Oct 12, 2009 Fang Song Homomorphic Encryption using Ideal Lattices
Oct 19, 2009 Abhradeep Guha Thakurta Privacy-preserving Frequent Itemset Mining
Oct 26, 2009 Madhav Jha Property Preserving Data Reconstruction: Monotonicity.
Nov 2, 2009 Swarat Chaudhuri Art of Invariant Generation (applied to Symbolic Bound Computation)
Nov 9, 2009 Ramamohan Paturi On the complexity of Circuit Satisfiability
Nov 16, 2009 Ge Ruan An O(log n) Approximation Ratio for the Asymmetric Traveling Salesman Path Problem(ATSPP).
Nov 23, 2009 Thanksgiving Break
Nov 30, 2009 T.K.Ramesh Extended Longest Path Problem: Color-coding and Divide-and-color.
Dec 07, 2009 S. Desmond Nathanson Algorithms for the Discrete Logarithm Problem

All talks will be at 3:30 pm on Mondays in IST 333 (unless they are part of the departmental colloquium).

If you would like to speak, please email us at: theory [at] cse [dot] psu [dot] edu.

If you want to receive announcements of upcoming talks, please join the theory mailing list .

Archives

### Query Containment and Rewriting Using Views (Prasenjit Mitra)

In order to improve the performance of a database or implement security policies, often, queries need to be rewritten in terms of previous queries. In this work, we assume the queries are expressed as conjunctive queries (queries posed as a conjunction of database relations) and conjunctive queries with arithmetic comparisons (of the form X \theta Y, where \theta is one of < or <= and X,Y are variables). Traditionally, the bucket algorithm has been used to rewrite queries using a set of existing queries. We show that the bucket algorithm is not complete and present the shared-variable bucket algorithm, which is sound and complete when the queries are conjunctive. The complexity of rewriting queries is closely dependent upon the complexity of checking for query containment, i.e., deciding whether the results produced by one query is contained in results produced another for any given database. When the queries contain arithmetic comparisons, the complexity of query containment is \Pi_2^P-complete. We identify special cases of arithmetic comparisons for which the problem is NP-complete and propose an algorithm for rewriting queries for this restricted class of arithmetic comparisons. We also outline an efficient, heuristic algorithm that can be used to minimize the size of the query rewritings.

### An introduction to Kolmogorov complexity (Steve Simpson)

Let A be a finite mathematical object. For instance, A could be an integer, or a finite sequence of 0's and 1's, or a finite graph, or a finite group. Associated with A is a positive integer K(A), the Kolmogorov complexity of A, which measures the "amount of information" in A, or the minimum length of a "description" of A. More precisely, the Kolmogorov complexity of A is the minimum length of a computer program P which describes A in the sense that, if we run P with no inputs, then P eventually outputs A and halts. In this talk we define Kolmogorov complexity and prove some of its basic properties. We also survey some connections between Kolmogorov complexity and various mathematical topics. Among these topics are algorithmic randomness, Hausdorff dimension, diagonal noncomputability, degrees of unsolvability, tilings of the plane, and 2-dimensional symbolic dynamics.

### Sublinear Graph Approximation Algorithms (Krzysztof Onak)

Can the size of a solution to a graph problem be computed faster than the solution itself? I will show that the answer to this question is positive in many cases. For instance, I will present the first approximation algorithm that for the family of graphs with degree bounded by d, computes the maximum matching size to within epsilon * n in time that only depends on d and epsilon.

I will give an overview of all known results, and I will discuss applications to property testing and local distributed algorithms. A byproduct of our techniques is a simple proof for the theorem of Benjamini, Schramm, and Shapira (STOC 2008) that every minor-closed property of bounded-degree graphs is testable in constant time.

Joint work with Noga Alon, Avinatan Hassidim, Jonathan A. Kelner, and Huy N. Nguyen.

### A constructive proof of Lovasz Local Lemma (Youngtae Youn)

If a large number of events are all independent of one another, then there is a positive probability that none of these events will occur. The Lovasz Local Lemma (LLL) relaxed the independence condition such that each event is independent of all but a small number of other events. In this case, there will still be a positive probability that none of them occurs. Unfortunately, the LLL is non-constructive because it does not indicate how to pick an explicit element of the probability space where no event occurs. There have been many attempts to present a constructive version of LLL under some restricted conditions. Recently, Robin Moser invented a simple and elegant constructive proof of LLL, which earned him the best student paper award and the best paper award at STOC 2009. In this talk, I will explain Moser's constructive proof and its direct application to the satisfiability problem.

### Leakage-resilient cryptography (Vinod Vaikuntanathan)

The absolute privacy of secrets has been the corner-stone of modern cryptography. However, in practice, this assumption is often violated because of information leakage due to side-channel attacks. Side channel attacks such as power-analysis attacks, electromagnetic (EM) attacks, timing attacks and the recent cold-boot attacks'' leak information about the secrets, and have been used to completely break many cryptographic systems in common use. In this talk, I will describe some recent work on mathematically modeling general classes of side-channel attacks, and designing cryptographic schemes secure against side-channels. In particular, I will show various constructions of

* Public-key encryption schemes, and

* Digital signature schemes

that are secure against *general* side-channel attacks.

I will also describe a method of protecting *any* cryptographic computation (including, as a special case, public-key encryption and digital signatures, but encompassing much more) against *specific* classes of side-channel attacks.

Bio:

Vinod Vaikuntanathan is a postdoctoral fellow in the cryptography group at IBM T. J. Watson. He received a Ph.D. from MIT in 2008 under the guidance of Shafi Goldwasser. He is a recipient of the MIT Akamai graduate fellowship, the IBM Josef Raviv Postdoctoral fellowship and the MIT George M. Sprowls Ph.D. thesis award. The focus of his research is cryptography, where his research involves devising new mathematical tools for cryptography, and applying theoretical cryptography to counter practical attacks.

### Fully Homomorphic Encryptions using Ideal Lattices (Fang Song)

Homomorphic Encryption enables us to perform operations on ciphertexts without decrypting them. For example, the (textbook) RSA scheme is homomorphic w.r.t. multiplication (i.e., product of two ciphertexts gives the encryption of the product of the two corresponding plaintexts). Previously, we only knew schemes that are homomorphic under limited type of operations (say, addition or multiplication). Schemes that allow evaluating any circuit on ciphertexts are called fullly homomorphic. The problem of constructing a fully homomorphic scheme has been open for almost 30 years, and many have conjectured that no such scheme exists.

Craig Gentry (STOC'09) resolved this problem based on some hardness assumption on ideal lattices. His approach is roughly two-step: he first showed a generic result that if a scheme could evaluate its own decryption circuit, then we can construct a fully homomorphic scheme from this so called "bootstrappable" scheme. Next, he proposed a bootstrappable encryption scheme using ideal lattices. I will go through the entire construction in my talk, but more effort will be placed on the first step, and I will probably only give high-level ideas about the second step.

### Property Preserving Data Reconstruction: Monotonicity. (Madhav Jha)

Nowadays, it is not unusual to see computations involving massive datasets. Users of such datasets (where user could just be an algorithm accessing the dataset) typically expects the dataset to satisfy certain structural property. Consider, for example, a Binary Search algorithm working on a dataset of list of numbers. The algorithm expects the dataset to be sorted and, in fact, it is a precondition for the correctness of the algorithm. The property of being sorted and many similar properties are sensitive to noise. Even a small amount of corruption could destroy the property. In such settings, one would usually like to modify the dataset minimally to recover the property. Because the underlying dataset is huge (eg. database of customer transactions), even going over the entire data completely once might not be feasible. The field of property testing and sublinear algorithms offers practical, sublinear algorithms for such tasks.

In this context, we seek to design a "filter" that sits between users of the dataset and the dataset itself. The filter accepts queries from user and repairs the dataset "on-the-fly" while answering the queries. We require that the number of "modifications" done by the filter to the dataset to recover the property is never more than a constant (say 2) factor off than what is absolutely necessary in an idealized offline setting. In this talk, I will focus on monotonicity, that is, the property that the dataset is sorted. I will present a sublinear (polylogarithmic) monotonicity reconstruction algorithm based on the paper : .Property-Preserving Data Reconstruction. by Nir Ailon, Bernard Chazelle, Seshadhri Comandur, Ding Liu which appeared in ISAAC (2004).

### Art of Invariant Generation (applied to Symbolic Bound Computation) (Swarat Chaudhuri)

Invariants are key to proofs of correctness of programs, whether manual or automated. The last decade has seen significant developments in the automated generation of program invariants. This talk will discuss three dimensions in the art of invariant generation: (a) Program Transformations, (b) Decision Procedures for Logics, and (c) Fixpoints. Program transformations alleviate the need for employing sophisticated invariant generation tools. Simpler invariants on the transformed program correspond to sophisticated invariants on the original program. Decision procedures are used to check satisfiability/validity of formulas involving arithmetic, uninterpreted functions, arrays, lists/reachability, and combinations of theories. Fixpoints are used to automatically generate loop invariants for a particular shade of logic. Examples include iteration (forward or backward), constraint based, proof-rules, and learning.

A variety of the above-mentioned principles for invariant generation will then be applied to present a solution to the important, but relatively under-studied, problem of computing a precise symbolic bound on the number of visits to a given location in a program.

### On the complexity of Circuit Satisfiability (Ramamohan Paturi)

We are concerned with the exponential complexity of the Circuit Satisfiability (CircuitSat) problem and more generally with the exponential complexity of NP-complete problems. Over the past 15 years or so, researchers have obtained a number of exponential-time algorithms with improved running times for exactly solving a variety of NP-complete problems. The improvements are typically in the form of better exponents compared to exhaustive search. Our goal is to develop techniques to prove specific lower bounds on the exponents under plausible complexity assumptions. We consider natural, though restricted, algorithmic paradigms and prove lower bounds on the exponent of the success probability. Our approach has the advantage of clarifying the relative power of various algorithmic paradigms.

Our main technique is a a success probability amplification technique, called the Exponential Amplification Lemma, which shows that for any f(n,m)-size bounded probabilistic circuit family A that decides CircuitSat with success probability at least $2^{-\alpha n}$ for $\alpha<1$ on inputs which are circuits of size $m$ with $n$ variables, there is another probabilistic circuit family B that decides CircuitSat with size roughly $f(\alpha n, f(m,n))$ and success probability about $2^{-\alpha^2 n}$. In contrast, the standard method for boosting success probability by repeated trials will improve it to $(1-(1-2^{-\alpha n})^t)$ ($\approx t2^{-\alpha n}$ for $t=O(2^{\alpha n}$)) using circuits of size about $tf(n,m)$.

Using this lemma, we derive tight bounds on the exponent of the success probability for deciding the CircuitSat problem in a variety of probabilistic computational models under complexity assumptions. For example, we show that the success probability cannot be better than $2^{-n+o(n)}$ for deciding CircuitSat by probabilistic polynomial size circuits unless CircuitSat (thereby all of NP) on polynomial size instances can be decided by $2^{n^\mu}$ size deterministic circuits for some $\mu <1$, which is considered an unlikely event. As another example, we show that probabilistic quasilinear size circuits cannot achieve success probability better than $2^{-n+o(n)}$ unless CircuitSat (as well as NP) has $O(m^{\lg \lg m})$ size deterministic circuits, which is very close to the statement $\np\in\ppoly$, a highly unlikely scenario.

### An O(log n) Approximation Ratio for the Asymmetric Traveling Salesman Path Problem(ATSPP). (Ge Ruan)

Traveling Salesman Problem (TSP) is a cornerstone problem for combinatorial optimization. It has a variety of applications, and several variants of this problem have been extensively studied.g sophisticated invariant generation tools. Simpler invariants on the transformed program correspond to sophisticated invariants on the original program. Decision procedures are used to check satisfiability/validity of formulas involving arithmetic, uninterpreted functions, arrays, lists/reachability, and combinations of theories. Fixpoints are used to automatically generate loop invariants for a particular shade of logic. Examples include iteration (forward or backward), constraint based, proof-rules, and learning.

In this talk, I will start by introducing TSP and its variants. Then we will focus on one of the variants of TSP: Given a complete directed graph G=(V,E,L), where L is the non-negative length of the edges, and a pair of vertices s,t, find an s-t path of minimum length that visits all the vertices in V. We refer to this problem as Asymmetric Traveling Salesman Path Problem (ATSPP). When s=t it is called Asymmetric Traveling Salesman tour Problem (ATSP).

I will describe Frieze, Galbalti and Maffioli's O(log n) approximation algorithm for ATSP (Networks 12, 1982). This is a starting point for O(log n) approximation algorithm for ATSPP by Chekuri and Pal (Theory of Computing,Volume 3, 2007), which will be the main focus of this talk. This algorithm was a big improvement over previous known O(sqrt(n)) approximation ratio for this problem, proposed by Lam and Newman in 2006.

This talk is based on "An O(log n) Approximation Ratio for the Asymmetric Traveling Salesman Path Problem(ATSPP)" by Chekuri and Pal (Theory of Computing, Volume 3, 2007).

### Extended Longest Path Problem: Color-coding and Divide-and-color. (T.K.Ramesh)

Given an undirected graph G with V vertices and E edges, the Extended Longest Path problem requires us to find all pairs of vertices that have a simple path of length k between them. In this talk, we shall look at three methods for solving this NP-complete problem. The method of Random Orientations, introduced by Alon et al. in 1995, takes random permutations of the vertex set to generate directed acyclic graphs, and thence solves the problem in O( k! poly(V,k)) time. The Color-coding approach, also introduced in the aforementioned paper, colors the vertices using k+1 colors. It aims to find 'colorful paths' in the colored graph (a 'colorful path' being one that has distinct colors for each of its vertices), and solves the original problem in O((2e)^k poly(V,E,k)) time. The Divide-and-Color approach introduced by Kneis et al in 2006, combines the divide-and-conquer approach with color-coding. It colors the vertex set using exactly two colors to obtain an extremely simple O(4^k poly(V,k)) algorithm.

The talk will be based on certain portions of the following papers:

[1] Color-coding (Alon, Yuster, Zwick; 1995)

[2] Divide-and-color (Kneis, Molle, Richter, Rossmanith; 2006)

### Algorithms for the Discrete Logarithm Problem. (S. Desmond Nathanson)

The discrete logarithm problem is an important problem in cryptography. Specifically, if it has some polynomial time algorithm, then popular cryptographic schemes, such as RSA, become breakable. Fortunately, no algorithms are known which can solve the problem in polynomial time. This talk covers algorithms which, run in super-polynomial time but faster than a simple brute force attack. Specifically, it covers the baby step, giant step algorithm discovered by Daniel Shanks, the Pohlig-Hellman algorithm discovered by Roland Silver and first published by Pohlig and Hellman, and lastly the index calculus method, which was discovered independently by Adleman, Merkle, and Pollard.