The weekly Theory seminar provides an opportunity to get exposed to
cutting edge research in theoretical computer science, to learn to
understand technical material from talks and to give technical
presentations. Graduate students interested in doing research in
algorithms, complexity theory, cryptography, quantum
computing and related areas are strongly encouraged to attend.

All talks will be at 3:30 pm on Mondays in IST 333 (unless they
are part of the departmental colloquium).

If you would like to speak,
please email us at: theory [at] cse [dot] psu [dot] edu.

If you want to receive announcements of upcoming talks, please
join the theory
mailing list.

Lower Bounds on Local Monotonicity Reconstruction from Transitive Closure Spanners

Consider an algorithm A computing on a large data set D,
say, an array of numbers. s correctness (and sometimes efficiency)
may be contingent upon the data set satisfying a certain structural
property. For example, A may require that D is sorted. In such
situations, it may be desirable that A access D via a "filter". The
filter ensures that data seen by A always satisfy the required
property, modifying D at few on the if required. This
model of property-preserving data reconstruction was introduced in
[Ailon Chazelle Comandur Liu, Algorithmica 2008]. Extending this
notion, Saks and Seshadhri (SIAM Journal on Computing, 2010) gave a
filter for the property of monotonicity (aka sortedness).
Specifically, given access to an oracle for an almost monotone
function f : [m]^d -> R, the filter can quickly evaluate a related
function g : [m]^d -> R which is guaranteed to be monotone.
Furthermore, the filter can be implemented in a distributed manner.
In this work, we show a new connection between efficiency of a
monotonicity filter and the size of 2-transitive-closure spanner
(2-TC-spanner). The latter were introduced in [Bhattacharyya
Grigorescu Jung Raskhodnikova Woodruff, SODA 2009]. Given a graph G =
(V,E), a graph of H on the same vertex set is a 2-TC-spanner of G if
distance between vertices u and v in H is at most 2 iff G has a path
from u to v. We show that an efficient local monotonicity filter
implies a sparse 2-TC-spanner of the directed hypergrid (hypercube),
providing a new technique for proving lower bounds for such filters.
We present tight upper and lower bounds on the size of the sparsest
2-TC-spanners of the directed hypercube and hypergrid. These bounds
imply tighter lower bounds for local monotonicity filters that nearly
match the known upper bounds.
Joint work with Arnab Bhattacharyya, Elena Grigorescu, Kyomin Jung,
Sofya Raskhodnikova and David Woodruff.

Approximation Algorithms for Min-Max Generalization Problems

We provide improved approximation algorithms for the min-max
generalization problems considered by Du, Eppstein, Goodrich, and
Lueker. In min-max generalization problems, the input consists of data
items with weights and a lower bound w_lb, and the goal is to
partition individual items into groups of weight at least w_lb, while
minimizing the maximum weight of a group. The rules of legal
partitioning are specic to a problem. Du et al. consider several
problems in this vein:

partitioning a graph into connected subgraphs,

partitioning unstructured data into arbitrary classes and

partitioning a 2-dimensional array into non-overlapping contiguous
rectangles (subarrays) that satisfy the above size requirements.

We significantly improve approximation ratios for all the problems
considered by Du et al., and provide additional motivation for these
problems. Moreover, for the first problem, while Du et al. give
approximation algorithms for specific graph families, namely,
3-connected and 4-connected planar graphs, no polynomial time
approximation algorithm that works for all graphs was known prior to
this work.
Joint work with Piotr Berman.

Codes for Computationally Limited Channels

There are, crudely, two popular ways to model the noise that
can occur when transmitting over an imperfect channel: "stochastic"
models, which assume a specific probability distribution on the
patterns of errors, and "adversarial" models, which limit only the
number (or magnitude) of errors that can occur. Codes that can
transmit reliably in the presence of adversarial errors are useful
since they work in a wide variety of conditions. However, designing
such codes is hard since the errors can be tailored to the particular
codeword being transmitted.

Claude Shannon, in his seminal work in information theory, developed
tools for determining the maximal rate at which information can be
transmitted in stochastic models. For example, we know that for the
binary symmetric channel $BSC_p$ which flips each transmitted bit
independently with probability $p$, there exist binary codes of rate
arbitrarily close to $1-H(p)$ that enable reliable information
transmission with exponentially small probability of mistaken
decoding. Here $H(\cdot)$ is the binary entropy function. The quantity
$1-H(p)$ is called the (Shannon) capacity of the $\bsc_p$ channel. But
what if the errors are adversarial? If the channel can corrupt up to a
fraction $p$ of symbols in an arbitrary manner *after* seeing the
codeword, it is known that for error-free communication to be
possible, the rate has to be much smaller than the Shannon capacity
$1-H(p)$.

In this talk, we survey a line of work which considers relaxations of
the model that enable achieving capacity even against adversarial
errors. The talk will focus on models which limit the channel to
*computationally simple behavior* and use ideas from cryptography to
reason about the channel. Such settings were considered in several
previous works, notably those of Lipton (STACS 1994) and Micali et
al.~(TCC 2005); both of those works showed how to achieve the Shannon
capacity in the presence of arbitrary polynomial-time channels, at the
cost of additional setup assumptions (either shared randomness or a
public-key infrastructure).

We conclude with some recent results (joint with V. Guruswami, to
appear at FOCS 2010) on removing these set-up assumptions.

Zero-Knowlede in a Quantum World

Fang will give us a brief introduction to zero-knowledge
protocols, explain why they are problematic if the verifier
("adversary") possesses even very limited quantum computational power,
and sketch a result of Watrous' (STOC 2006) that gives us some initial
tools for addressing the problems.

A Local Heuristic for the k-Medians Problem and
Its Applications

Given a graph G, the goal in the k-medians problem is to
find a set of k distinguished vertices ("centers") which minimize the
sum, over all vertices in G, of the distance from a vertex to its
nearest center. We will look at a local search heuristic for the
k-medians problem and prove that the heuristic provides a
3-approximation to the global optimum (a result due to Arya et al.,
STOC 2001). Afterwards, we will look at how we can use results from
the proof to generate differentially private approximation algorithms
for k-median (following Gupta et al., SODA 2010).

Papers we will discuss:

Vijay Arya, Naveen Garg, Rohit Khandekar, Adam Meyerson, Kamesh
Munagala, Vinayaka Pandit: Local search heuristic for k-median and
facility location problems. STOC 2001, p. 21-29

Anupam Gupta, Katrina Ligett, Frank McSherry, Aaron Roth, Kunal
Talwar: Differentially Private Combinatorial Optimization. SODA 2010,
p. 1106-1125.

A Hands-On Introduction to Automated Theorem-Proving

What is a proof, and what makes some proofs harder to find, write and
understand than others? Swarat won't answer any of these questions
during
the talk. We will, however, acquire some fodder for thinking about
these topics by learning about software that helps a human generate
machine-verifiable proofs. We'll see a few examples of proofs that are
easy to automate, and also get a sense of which types of proofs
require more human help (at least, with this system).

The purpose of this seminar is (a) to educate attendees about an
interesting topic at the intersection of logic and computer science
and (b) take a first step towards developing some course materials to
help
students understand rigorous reasoning.

In order to make it more interactive, we would like as many people as
possible to get the software up and running on their laptops before
the seminar. See instructions below.

Two of the examples we will work through will consist of verifying the
solutions to two short puzzles (below). Other examples will come from
basic discrete math and algorithms.

Local Ratio Approximation Algorithms: An Introduction

The "local ratio" framework for the design of approximation algorithms
simplifies the analysis of many classic approximation algorithms. It
has also led to improvements in the state of the art for many
problems, in terms of both running time and approximation ratio. We'll
see the basic framework and how it applies to example problems in
several domains: partial vertex cover, partial hitting set and
interval scheduling.

Polylogarithmic Approximation for Edit Distance and
the Asymmetric Query Complexity

We present a near-linear time algorithm for approximating the edit
distance between two strings to within a significantly better factor
than previously known. This result emerges in our investigation of
edit distance from a new perspective, namely a model of asymmetric
queries, for which we present near-tight query complexity bounds.
Another consequence of this new model is the first rigorous separation
between edit distance and Ulam distance, which we obtain by tracing
the hardness of edit distance back to a phenomenon that was not used before.

Joint work with Alexandr Andoni and Robert Krauthgamer.

Two-Party Differential Privacy and Deterministic Extraction from Santha-Vazirani Sources

The talk presents recent results (from a paper at FOCS 2010)
on "differential privacy" in a distributed setting: two parties
holding sensitive data would like to analyze the union of their data
sets while preserving the privacy of the individuals whose data they
hold.

The paper give almost tight bounds on the accuracy of such analysis
for the Hamming distance function. These bounds show the contrast
between information theoretic and computational differential privacy
in the two-party setting and also between the two-party and the
client-server setting. The proof technique introduces a connection
between differential privacy and deterministic extraction from
independent Santha-Vazirani sources.

Introduction to Fourier Analysis of Boolean Functions

In this talk we give a brief introduction to Fourier
analysis of boolean functions. We first discuss some the basic
concepts like Parseval's theorem, bias, influence, energy, and noise
stability. We then move on to discuss Condorcet's voting paradox and
Arrow's Theorem (which says that dictatorship is the only way to avoid
Condorcet's paradox). We also discuss Kalai's Fourier analytic proof
of Arrow's theorem. We follow the proof with a brief discussion about
the "robustness" of his proof.

(Adapted from the tutorial talk by Ryan O'Donnell at STOC'08)

Computing Market Equilibrium Prices for Linear Fisher
Markets

Consider a Fisher market where there are n buyers with fixed budget
and m divisible goods. Given prices for the goods, each buyer wants to
buy a bundle of goods that maximizes her utility. The market is said
to be at an equilibrium when each buyer is allocated with a
utility-maximizing bundle of goods and there is no deficit or surplus
of goods. Arrow and Debreu proved the equilibrium price exists for a
wide range of markets under a mild assumption, but polynomial-time
algorithms for finding such a price have drawn attention only
relatively
recently. In this talk, we will see the basic framework to find such a
price in
polynomial-time for a special case of markets with linear utility.

The talk is mostly based on the paper "Improved Algorithms for
Computing Fisher's Market Clearing Prices" by James B. Orlin in STOC
2010.

Finding, Minimizing and Counting Weighted Subgraphs

We will present several recent algorithms of Vassilevska and
Williams (STOC 2009) for finding minimum-weight copies of a given
graph H in a larger graph G with node and/or edge weights.
Specifically, we'll discuss a general reduction from the problem of
finding arbitrary graphs H to the problem of finding triangles of a
given weight. We will also present two algorithms for finding weighted
triangles; one of the algorithms works well in dense graphs and the
other, in sparse ones.

Indexing Feature Selection for Graph Querying

Graph data are becoming more common in many scientific and commercial
fields. One of the most important computational problems with
databases of graphs is *subgraph querying*: given a particular query
graph and a database, the goal is to return all graphs in the database
that are supergraphs of the query. "Feature-based graph indexes" are
commonly used to filter out all but a (hopefully) small set of
candidate graphs which are then verified with subgraph isomorphism
tests.

In this talk, I will explain the motivation and basic framework for
these indexes and
discuss several algorithms for selecting the features used to build
indexes. I will explain the principles behind the algorithms' design
and compare their empirical performance.