## Spring 2023

Date | Speaker | Talk Title |

January 23 | – | Student Research Lunch |

January 30 | – | Lunch |

February 6 | – | Student Research Lunch |

February 13 | Maryam Aliakbarpour | Statistical inference with privacy and computational constraints |

February 21 | – | Student Research Lunch |

February 27 | – | Student Research Lunch |

March 6 | – | Spring Break |

March 13 | – | Lunch |

March 20 | – | – |

March 27 | – | Student Research Lunch |

April 3 | NadyaVoronova | Approximate degree lower bounds for oracle identification problems |

April 10 | Jessie Finocchiaro | Designing convex and consistent surrogates via loss embeddings |

April 19 | – | – |

April 24 | Satchit Sivakumar | Stability is Stable: Connections between Replicability, Privacy, and Adaptive Generalization |

May 1 | Jin-Yi Cai | Quantum isomorphism, Planar graph hom*omorphism, and complexity dichotomy |

### Abstracts

**Date: **May 1 2023

**Speaker:** Jin-Yi Cai

**Title:**Quantum isomorphism, Planar graph hom*omorphism, and complexity dichotomy

**Abstract: **I will discuss some recent results on the following inter-related topics:

1. a connection between quantum isomorphism and counting constraint satisfaction

problems #CSP on planar instances;

2. a complexity dichotomy for planar graph hom*omorphisms for domain 3.

Based on some joint work with Ashwin Maran and Ben Young

**Date: **April 24 2023

**Speaker:** Satchit Sivakumar

**Title: **Stability is Stable: Connections between Replicability, Privacy, and Adaptive Generalization

**Abstract: **In this talk, we will discuss a recent definition of replicability as a property of algorithms, introduced with the goal of formulating solutions to the ongoing ‘crisis of replicability’ in empirical science. One of the main purposes of this definition is to allowfor the efficient verification of the results of statistical analyses. Replicability is one of many algorithmic stability notions aimed at ensuring the safety of modern data analyses- others have played central roles in mature fields such as differential privacy and adaptive data analysis. We describe connections and separations between many of these notions of algorithmic stability- giving (tight) sample-efficient algorithmic reductions between replicability, approximate differential privacy, and perfect generalization. Conversely, we show that such equivalences must break down computationally. Our statistical reductions give a new framework for translating between stability notions, which we use to answer several open questions in replicability and privacy. This includes giving sample-efficient replicable algorithms for various statistical problems, asymptotic amplification of $\delta$ in differential privacy, conversions from item-level to user-level privacy, and private agnostic-to-realizable learning reductions under structured distributions.

Based on joint work with Mark Bun, Marco Gaboardi, Max Hopkins, Russell Impagliazzo, Rex Lei, Toniann Pitassi, and Jess Sorrell.

**Date: **April 10 2023

**Speaker:** Jessie Finocchiaro

**Title: **Designing convex and consistent surrogates via loss embeddings

**Abstract: **We formalize and study the natural approach of designing convex surrogate loss functions via embeddings, for problems such as classification, ranking, or structured prediction. In this approach, one embeds each of the finitely many predictions (e.g. rankings) as a point in the d-dimensional reals, assigns the original loss values to these points, and “convexifies” the loss in some way to obtain a surrogate. We establish a strong connection between this approach and polyhedral (piecewise-linear convex) surrogate losses: every discrete loss is embedded by some polyhedral loss, and every polyhedral loss embeds some discrete loss. Moreover, an embedding gives rise to a consistent link function as well as linear surrogate regret bounds. Our results are constructive, as we illustrate with several examples. In particular, our framework gives succinct proofs of consistency or inconsistency for various polyhedral surrogates in the literature, and for inconsistent surrogates, it further reveals the discrete losses for which these surrogates are consistent. We go on to show additional structure of embeddings, such as the equivalence of embedding and matching Bayes risks, and the equivalence of various notions of non-redudancy. Using these results, we establish that indirect elicitation, a necessary condition for consistency, is also sufficient when working with polyhedral surrogates. Based on joint work with Raf Frongillo and Bo Waggoner.

TLDR: Designing “nice” loss functions for discrete prediction tasks is typically ad hoc. We give a framework for designing convex and statistically consistent losses for these tasks and demonstrate its application for tasks like top-k prediction and image segmentation.

**Date: **April 3 2023

**Speaker:** NadyaVoronova

**Title:**Approximate degree lower bounds for oracle identification problems

**Abstract:**The approximate degree of a Boolean function is the minimum degree of real polynomial that approximates it pointwise. For any Boolean function, its approximate degree serves as a lower bound on its quantum query complexity, and generically lifts to a quantum communication lower bound for a related function.

We introduce a framework for proving approximate degree lower bounds for certain oracle identification problems, where the goal is to recover a hidden binary string x ∈ {0, 1}^n given possibly non-standard oracle access to it. Using this framework, we prove approximate degree lower bounds for ordered search and hidden string problems of Ω(n/log^2 n) for each. Our result for the ordered search problem settles (up to polylog n factor) the complexity of a fundamental computational problem (searching an ordered list) in a natural and important model (polynomial degree), while the result for the hidden string problem gives a more natural proof for already existing lower bound on quantum query complexity.

Based on joint work with Mark Bun.

https://arxiv.org/pdf/2303.03921.pdf

**Date: **January 23rd 2023

**Speaker:** Maryam Aliakbarpour

**Title:**Statistical inference with privacy and computational constraints

**Abstract: **The vast amount of digital data we create and collect has revolutionized many scientific fields and industrial sectors. Yet, despite our success in harnessing this transformative power of data, computational and societal trends emerging from the current practices of data science necessitate upgrading our toolkit for data analysis.

In this talk, we discuss how practical considerations such as privacy and memory limits affect statistical inference tasks. In particular, we focus on two examples:

First, we consider hypothesis testing with privacy constraints. More specifically, how one can design an algorithm that tests whether two data features are independent or correlated with a nearly-optimal number of data points while preserving the privacy of the individuals participating in the data set.

Second, we study the problem of entropy estimation of a distribution by streaming over i.i.d. samples from it. We determine how bounded memory affects the number of samples we need to solve this problem.

## Fall 2022

Date | Speaker | Talk Title |

September 12 | No Speaker | Social Lunch! |

September 19 | Anurag Anshu | NLTS Hamiltonians from good quantum codes |

September 26 | Nicole Wein | Closing the Gap Between Directed Hopsets and Shortcut Sets |

October 3 | Sam Hopkins | The Sum of Squares Method & Applications to Differentially-Private Statistics |

October 11 (Tue) | Tolik Zinovyev | Space-stretch tradeoff in routing revisited |

October 17 | Dor Minzer | On Approximability of CSPs on Satisfiable Instances |

October 24 | Tamalika Mukherjee | How to Make Your Approximation Algorithm Private: A Black-Box Differentially-Private Transformation for Tunable Approximation Algorithms of Functions with Low Sensitivity |

October 31 | Santoshini Velusamy | Approximating CSPs in the streaming setting |

November 7 | Jad Silbak | Explicit uniquely decodable codes for computationally bounded channels that achieve BSC capacity |

November 14 | Jane Lange | Properly Learning Monotone Functions via Local Reconstruction |

November 21 | Ta Duy Nguyen | Adaptive methods for convex optimization for unconstrained problems |

November 28 | – | No speaker; group lunch |

December 5th | Jessie Joan Finocchiaro | Designing convex and consistent surrogates via loss embeddings (canceled) |

### Abstracts

**Date: **December 5th 2022

**Speaker: **Jessie Joan Finocchiaro

**Title: **Designing convex and consistent surrogates via loss embeddings.

**Abstract: **We formalize and study the natural approach of designing convex surrogate loss functions via embeddings, for problems such as classification, ranking, or structured prediction. In this approach, one embeds each of the finitely many predictions (e.g. rankings) as a point in the d-dimensional reals, assigns the original loss values to these points, and “convexifies” the loss in some way to obtain a surrogate. We establish a strong connection between this approach and polyhedral (piecewise-linear convex) surrogate losses: every discrete loss is embedded by some polyhedral loss, and every polyhedral loss embeds some discrete loss. Moreover, an embedding gives rise to a consistent link function as well as linear surrogate regret bounds. Our results are constructive, as we illustrate with several examples. In particular, our framework gives succinct proofs of consistency or inconsistency for various polyhedral surrogates in the literature, and for inconsistent surrogates, it further reveals the discrete losses for which these surrogates are consistent. We go on to show additional structure of embeddings, such as the equivalence of embedding and matching Bayes risks, and the equivalence of various notions of non-redudancy. Using these results, we establish that indirect elicitation, a necessary condition for consistency, is also sufficient when working with polyhedral surrogates. Based on joint work with Raf Frongillo and Bo Waggoner.

**Date: **November 21st 2022

**Speaker: **Ta Duy Nguyen

**Title: **Adaptive methods for convex optimization for unconstrained problems.

**Abstract: **Existing analysis of AdaGrad and other adaptive methods for smooth convex optimization is typically for functions with bounded domain diameter. In unconstrained problems, previous works guarantee an asymptotic convergence rate without an explicit constant factor that holds true for the entire function class. In this talk I will demonstrate new techniques to explicitly bound the convergence rate of the scalar variant of AdaGrad, also known as AdaGrad-Norm, for unconstrained problems in both deterministic and stochastic settings. I will also introduce new variants of AdaGrad-Norm for which we can show the convergence of the last iterate with acceleration in the deterministic setting, improving upon asymptotic rates shown in previous works.

Based on joint work with Alina Ene, Huy Nguyen, and Zijian Liu.

**Date: **November 14th 2022

**Speaker: **Jane Lange

**Title: **Properly Learning Monotone Functions via Local Reconstruction

**Abstract: **We give a 2^{O(sqrt(n)/eps)}-time algorithm for properly learning monotone Boolean functions under the uniform distribution over {0,1}^{n}. Our algorithm is robust to adversarial label noise and has a running time nearly matching that of the state-of-the-art improper learning algorithm of Bshouty and Tamon [BT96] and an information-theoretic lower bound of [BCO+15]. Prior to this work, no proper learning algorithm with running time smaller than 2 ^{Ω}^{(n)}was known to exist. The core of our proper learner is a local computation algorithm for sorting binary labels on a poset. Our algorithm is built on a body of work on distributed greedy graph algorithms; specifically, we rely on a recent work of Ghaffari and Uitto [GU19], which gives an efficient algorithm for computing maximal matchings in a graph in the LCA model of [RTVX11, ARVX11]. The applications of our local sorting algorithm extend beyond learning on the Boolean cube: we also give a tolerant tester for Boolean functions over general posets that distinguishes functions that are eps/3-close to monotone from those that are eps-far. Previous tolerant testers for the Boolean cube only distinguished between eps/sqrt(n)-far and eps-far.

**Speaker: **Jad Silbak

**Title: **Explicit uniquely decodable codes for computationally bounded channels that achieve BSC capacity

**Abstract: **Guruswami and Smith (J. ACM 2016) considered codes for channels that are computationally bounded which modify at most a p-fraction of the bits of the codeword. This class of channels is significantly stronger than Shannon’s binary symmetric channel which flips each bit independently with probability p, but weaker than Hamming’s channels which may flip any p-fraction of bits and are computationally unbounded.

Recently, there has been a growing body of work that aims to construct codes against channels that are computationally bounded (e.g., bounded memory channels, channels that are poly-size circuits). In this talk I will explain this direction, focusing on a recent results by Shaltiel and Silbak (STOC 2021, FOCS 2022) that consider channels with bounded memory and bounded size (respectively), with codes that:

– Achieve optimal rate of 1-H(p) (matching the rate of binary symmetric channels, and beating the rate of Hamming channels).

– Are explicit, meaning that they have poly-time encoding and decoding algorithms.

Our techniques rely on ideas from coding theory and pseudorandomness. Key ingredients in our construction are:

-A new notion of “evasiveness” of codes, which is concerned with whether a decoding algorithm for say, binary symmetric channels, rejects a word that is obtained when a computationally bounded channel induces few errors to a uniformly chosen (or pseudorandom) string.

-A new notion of small set non-malleable codes, that is a new variant of the celebrated notion of non-malleable codes (introduced by Dziembowski, Pietrzak and Wichs (J. ACM 2018)), which is tailored for our specific application, and may be of independent interest.

This is a joint work with Ronen Shaltiel.

**Date: **October 31st 2022

**Speaker: **Santoshini Velusamy

**Title:**Approximating CSPs in the streaming setting

**Abstract:**Constraint satisfaction problems (CSPs) are ubiquitous in computer science and encompassoptimization problems such as Max-CUT, Max-DICUT, Max-k-SAT, Max-q-Coloring, Unique Games, etc.Until recently, Max-CUT and a couple of other Max-2CSPs were the only CSPs studied in the streaming setting. In this talk, I will first describe our results giving new upper and lower bounds on the approximability of every CSP in the single-pass streaming setting. In particular, we prove the following theorem: For every CSP, there is a constant ⍺ such that the CSP is ⍺-approximable in polylogarithmic space by a linear sketching algorithm and any sketching algorithm that beats ⍺-approximation requires polynomial space. I will conclude with some interesting results from recent follow-up works and also discuss open problems in this direction.

**Date: **October 24th 2022

**Speaker:** Tamalika Mukherjee

**Title: **How to Make Your Approximation Algorithm Private: A Black-Box Differentially-Private Transformation for Tunable Approximation Algorithms of Functions with Low Sensitivity

**Abstract: **We develop a framework for efficiently transforming certain approximation algorithms into differentially-private variants, in a black-box manner. Our results focus on algorithms A that output an approximation to a function f of the form (1−a)f(x)−k<=A(x)<=(1+a)f(x)+k, where 0<=a <1 is a parameter that can be“tuned” to small-enough values while incurring only a poly blowup in the running time/space. We show that such algorithms can be made DP without sacrificing accuracy, as long as the function f has small global sensitivity. We achieve these results by applying the smooth sensitivity framework developed by Nissim, Raskhodnikova, and Smith (STOC 2007).

Our framework naturally applies to transform non-private FPRAS (resp. FPTAS) algorithms into (ϵ,δ)-DP (resp. ϵ-DP) approximation algorithms. We apply our framework in the context of sublinear-time and sublinear-space algorithms, while preserving the nature of the algorithm in meaningful ranges of the parameters. Our results include the first (to the best of our knowledge) (ϵ,δ)-edge DP sublinear-time algorithm for estimating the number of triangles, the number of connected components, and the weight of a MST of a graph, as well as a more efficient algorithm (while sacrificing pure DP in contrast to previous results) for estimating the average degree of a graph. In the area of streaming algorithms, our results include (ϵ,δ)-DP algorithms for estimating L_p-norms, distinct elements, and weighted MST for both insertion-only and turnstile streams. Our transformation also provides a private version of the smooth histogram framework, which is commonly used for converting streaming algorithms into sliding window variants, and achieves a multiplicative approximation to many problems, such as estimating L_p-norms, distinct elements, and the length of the longest increasing subsequence.

**Date: **October 17th 2022

**Speaker:** Dor Minzer

**Title: **On Approximability of CSPs on Satisfiable Instances

**Abstract: **Constraint Satisfaction Problems (CSPs) are among the most well-studied problems in Computer Science, 3SAT being a prominent example.

For each predicate P, the corresponding satisfiability problem CSP(P) is to determine, given a list of constraints of the form given by P, whether there’s an assignment to the variables satisfying all constraints or not. The recently proven Dichotomy Theorem of Bulatov and Zhuk asserts that for every predicate, this problem is either in the class P, or is NP-complete. A natural question arises: does a similar dichotomy behavior occur for approximation problems?

Namely, for a predicate P for which CSP(P) is NP-complete, is there a threshold α<1 such that finding an assignment satisfying α-fraction of the constraints is computationally easy, while satisfying (α+ε) fraction is hard? This natural question, surprisingly, is wide open, though such thresholds are known for some specific predicates (e.g., for 3SAT).

The talk will present recent works initiating a study of this question, as well as connections to the analogous question for the almost satisfiable case.

Based mainly on joint works with Amey Bhangale and Subhash Khot.

**Date:** (Tuesday) October 11th 2022

**Speaker:** Tolik Zinovyev

**Title:** Space-stretch tradeoff in routing revisited

**Abstract: **We present several new proofs of lower bounds for the space-stretch tradeoff in labeled network routing. First, we give a new proof of an important result of Cyril Gavoille and Marc Gengler that any routing scheme with stretch < 3 must use Ω(n) bits of space at some node on some network with n vertices, even if port numbers can be changed. Compared to the original proof, our proof is significantly shorter and, we believe, conceptually and technically simpler. A small extension of the proof can show that, in fact, any constant fraction of the n nodes must use Ω(n) bits of space on some graph. Our main contribution is a new result that if port numbers are chosen adversarially, then stretch < 2k + 1 implies some node must use Ω( n^(1/k) log(n) ) bits of space on some graph, assuming a girth conjecture by Erdős. We conclude by showing that all known methods of proving a space lower bound in the labeled setting, in fact, require the girth conjecture.

**Date: **October 3rd 2022

**Speaker:** Sam Hopkins

**Title:** The Sum of Squares Method & Applications to Differentially-Private Statistics

**Abstract: **I will discuss the Sum of Squares method for algorithm design in the context of several recent advances in algorithmic statistics. In the latter half of the talk, I will describe some recent applications to designing polynomial-time algorithms for private statistics in high dimensions with near-optimal sample/accuracy/privacy tradeoffs.

**Date: **September 26th 2022

**Speaker:** Nicole Wein

**Title:** Closing the Gap Between Directed Hopsets and Shortcut Sets

**Abstract:** A*shortcutset*is a (small) set of edges that when added to a directed graph G produces a graph with the same reachabilitystructure as G while reducing thediameteras much as possible. A hopset is defined similarly but for approximate distances instead of reachability. One motivation for such structures is that in many settings (e.g. distributed, parallel, dynamic, streaming), computation is faster when the diameter of the graph is small. Thus, to compute, for instance, st-reachability in such a setting, it is useful to find a shortcut set as a preprocessing step, and then run an st-reachability algorithm on the resulting graph.

This talk is about existential bounds for shortcut sets and hopsets: given a budget of ~O(n) added edges, what is the smallest diameter achievable? A folklore algorithm says that diameter O(n^{1/2}) is possible for any graph. A recent breakthrough of Kogan and Parterimproves this to O(n^{1/3}) for shortcut sets and O(n^{2/5}) for hopsets. In recent work, we ask whether hopsets (approximate distances) are trulyharder than shortcut sets (reachability). We close the gap between hopsets and shortcut sets by providing an O(n^{1/3})-diameter construction for hopsets.

Based on joint work with Aaron Bernstein

**Date: **September 19th 2022

**Speaker:** Anurag Anshu

**Title: **NLTS Hamiltonians from good quantum codes

**Abstract: **The NLTS (No Low-Energy Trivial State) conjecture of Freedman and Hastings [2014] posits that there exist families of Hamiltonians with all low energy states of non-trivial complexity (with complexity measured by the quantum circuit depth preparing the state). Our recent workhttps://arxiv.org/abs/2206.13228 (with Nikolas Breuckmann and Chinmay Nirkhe) proves this conjecture by showing that the recently discovered families of constant-rate and linear-distance QLDPC codes correspond to NLTS local Hamiltonians. This talk will provide background on the result, its relevance to quantum complexity theory, and touch upon the proof techniques.

## Spring 2022

Date | Speaker | Talk Title |

January 31 | Alina Ene | Extra-gradient algorithms for min-max optimization and beyond |

February 7 | Rathin Desai | Learning vs Refutation |

February 14 | THEORY SOCIAL | |

February 22 | Abhishek Shetty | Matrix Discrepancy via Quantum Communication |

February 28 | Ludmila Glinskih | The Complexity of Verifying Boolean Programs as Differentially Private |

March 7 | SPRING BREAK | |

March 14 | Mitali Bafna | Playing Unique Games on Certifiable Small-Set Expanders and Beyond |

April 11 | Clayton Sanford | On the Approximation Power of Two-Layer Networks of Random ReLUs |

April 20 | Jakub Łacki | Scaling up Hierarchical Agglomerative Clustering toTrillion-Scale Datasets |

April 25 | Ankur Moitra | Rethinking Robustness: From Classification to Contextual Bandits |

May 2 | Ilya Volkovich | Deterministic Factorization of Sparse Polynomials with Bounded Individual Degree |

May 9 | Kasper Green Larsen | Towards Optimal Lower Bounds for k-means Coresets |

### Abstracts

**Speaker:** Alina Ene

**Title:** Extra-gradient algorithms for min-max optimization and beyond

**Abstract: **In this talk, we give a gentle introduction to first-order algorithms for solving a general class of optimization problems, called variational inequalities with monotone operators. Variational inequalities are a general framework that captures many problems of interest, notably convex optimization and convex-concave saddle point problems. We will discuss classical algorithms based on Extra-gradient as well as more recent progress on making these methods adaptive to unknown problem parameters, such as smoothness.

The latter part of the talk is based on joint work with Huy Nguyen (Northeastern), available here: https://arxiv.org/abs/2010.07799

**Speaker:** Rathin Desai

**Title:** Learning vs Refutation

**Abstract: **In this talk, we discuss the computational hardness of PAC learning, specifically the connection between computationally efficient PAC learning and refutation of constraint satisfaction problems. We will discuss the equivalence between random-right-hand-side (RRHS) refuting and PAC learning. Additionally, time permitting, we will discuss an application of these techniques to show computational hardness of learning DNFs.

This talk is based on the following papers:

1) On Learning versus Refutation – Salil Vadhan, COLT 2017

2) From average case complexity to improper learning complexity – Amit Daniely, Nati Linial, Shai Shalev-Shwartz, STOC 2014

3) Agnostic Learning by Refuting – Pravesh Kothari, Roi Livni, ITCS 2018

**Speaker:** Abhishek Shetty

**Title:** Matrix Discrepancy via Quantum Communication

**Abstract: **In this talk, we will discuss a novel connection between discrepancy minimization and (quantum) communication complexity. As an application, we resolve a substantial special case of the Matrix Spencer conjecture. In particular, we show that for every collection of $n$ $n \times n$ symmetric matrices $A_1 \dots A_n$ with spectral norm bounded by 1 and Frobenius norm bounded by$n^{1/4}$, there is a signing $x$ such that $|| \sum x_i A_i|| \leq \sqrt{n}$ We give a polynomial-time algorithm based on partial coloring and semidefinite programming to find such a sign.

The proof of our main result combines a simple compression scheme for transcripts of repeated (quantum) communication protocols with quantum state purification, the Holevo bound from quantum information, and tools from sketching and dimensionality reduction. Our approach also offers a promising avenue to resolve the Matrix Spencer conjecture completely — we show it is implied by a natural conjecture in quantum communication complexity.

The talk is based on joint work with Sam Hopkins (MIT) and Prasad Raghavendra (UC Berkeley), to appear at STOC 2022 and available here: https://arxiv.org/abs/2110.10099

**Speaker:** Ludmila Glinskih

**Title:** The Complexity of Verifying Boolean Programs as Differentially Private

**Abstract: **In this talk I will present our recent work on the computational complexity of verifying whether programs are differentially private. We study this problem for while-like programs working over boolean data and making probabilistic choices. Programs in this class can be interpreted into finite-state discrete-time Markov Chains (DTMC). We show that the problem of deciding whether a program is differentially private for specific values of the privacy parameters is PSPACE-complete.

I will also briefly discuss similar complexity results for several relaxations of differential privacy: Rényi differential privacy, concentrated differential privacy, and truncated concentrated differential privacy. For these notions, we consider gapped promise versions of the privacy verification problem and we show that all of them are PSPACE-complete.

This is a joint work with Mark Bun and Marco Gaboardi. It will appear at CSF’22.

**Speaker: **Mitali Bafna

**Title: **Playing Unique Games on Certifiable Small-Set Expanders and Beyond

**Abstract: **The Unique Games Conjecture (UGC) is a central open question in computational complexity and algorithms. In short, the UGC stipulates that distinguishing between almost satisfiable and highly unsatisfiable instances of a certain 2-variable constraint satisfaction problem (CSP) called Unique Games is NP-hard. We build algorithms for UG on a large class of restricted instances including certified small-set expanders and Johnson graphs. In doing so we give new tools to analyze Sum-of-Squares SDPs and connect the UGC to another important complexity theoretic conjecture, the Small-Set-Expansion Hypothesis. In another work we prove structural inequalities for high-dimensional expanders and give UG algorithms on these graphs.

This talk is based on the joint works: https://arxiv.org/abs/2006.09969 and https://arxiv.org/abs/2011.04658.

**Speaker: **Clayton Sanford

**Title: **On the Approximation Power of Two-Layer Networks of Random ReLUs

**Abstract: **This talk discusses the following question: how well can depth-two ReLU networks with randomly initialized bottom-level weights represent smooth functions? We give near-matching upper and lower bounds for L2-approximation in terms of the Lipschitz constant, the desired accuracy, and the dimension of the problem, as well as similar results in terms of Sobolev norms. Our positive results employ tools from harmonic analysis and ridgelet representation theory, while our lower-bounds are based on (robust versions of) dimensionality arguments. I’ll discuss these results and their implications to depth separation and the hardness of learning sinusoidal functions with gradient descent with deeper neural networks.

This talk is based on a COLT 2021 paper: https://arxiv.org/abs/2102.02336

**Speaker: **Jakub Łącki

**Title:** Scaling up Hierarchical Agglomerative Clustering to Trillion-Scale Datasets

**Abstract: **Hierarchical agglomerative clustering (HAC) is a simple and widely popular clustering method known for its high empirical quality. However, the applications of HAC on large datasets have been limited by the high computational cost of the existing implementations. Addressing this limitation has been an elusive task, as the algorithm has been believed to be inherently sequential and computationally expensive. In this talk we study the HAC problem on edge-weighted graphs assuming the widely-used average linkage similarity function. We first give conditional hardness results showing that HAC requires time that is super-linear in the input size, and cannot be parallelized efficiently, formalizing the commonly held intuitions about the algorithm. We then show how both of these limitations can be bypassed by allowing approximation. Namely, we show that 1+ε approximate HAC can be implemented in near-linear work and polylogarithmic depth. Finally, we show that our ideas lead to highly scalable HAC implementations. In a shared memory parallel setting (i.e., on a single machine) we obtain an over 50x speedup over the best existing baseline. Moreover, in the distributed setting, we demonstrate that our implementation scales to graphs containing trillions of edges.

**Speaker:** Ankur Moitra

**Title:** Rethinking Robustness: From Classification to Contextual Bandits

**Abstract:** What does it mean for a learning algorithm to be robust? We could hope to tolerate adversarial corruptions, but this makes many simple learning tasks computationally intractable. We could assume a particular stochastic model for noise, but then our algorithms might be overtuning to this one distribution, which was only meant to be a simplifying assumption.

In this talk we will revisit some classic learning problems, like learning halfspaces, online regression and contextual bandits, in models that blend worst-case and average-case noise. We will give simple twists on old algorithms that allow them to enjoy strong provable robustness guarantees. Finally we explore some intriguing connections between robustness and fairness.

**Speaker:** Ilya Volkovich

**Title:** Deterministic Factorization of Sparse Polynomials with Bounded Individual Degree

**Abstract:** We study the problem of deterministic factorization of sparse polynomials. We show that if f \in \F[x1,…,xn] is a polynomial with s monomials, with individual degrees of its variables bounded by d, then f can be deterministically factored in time s^{O(\poly(d)log n)}.

Prior to our work, the only efficient factoring algorithms known for this class of polynomials were randomized, and other than for the cases of d=1 and d=2, only exponential time deterministic factoring algorithms were known. A crucial ingredient in our proof is a quasi-polynomial sparsity bound for factors of sparse polynomials of bounded individual degree. In particular we show if f is an s-sparse polynomial in n variables, with individual degrees of its variables bounded by d, then the sparsity of each factor of f is bounded by s^{O(d^2 logn)}. This is the first nontrivial bound on factor sparsity for d>2. Our sparsity bound uses techniques from convex geometry, such as the theory of Newton polytopes and an approximate version of the classical Caratheodory’s Theorem.

Our work addresses and partially answers a question of von zur Gathen and Kaltofen (JCSS 1985) who asked whether a quasi-polynomial bound holds for the sparsity of factors of sparse polynomials. This is joint work with Vishwas Bhargava and Shubhangi Saraf.

**Speaker:** Kasper Green Larsen

**Title:** Towards Optimal Lower Bounds for k-means Coresets

**Abstract:** Given a set of points in Euclidian space, the k-means problem consists of finding a set of k points called centers, such that the sum of distances squared of every data point to its closest center is minimized. The k-means problems is at the heart of modern data analysis and massive data applications have given rise to the notion of a coreset: a small (weighted) subset of the input point set preserving the cost of any solution to the problem up to a multiplicative (1 + ε) factor, hence reducing from large to small scale the input to the problem.

While there has been an intensive effort to understand what is the best coreset size possible, there is still a significant gap between the state- of-the-art upper and lower bounds. In this talk, we make progress by showing an Omega(k/eps^2) lower bound. This improves over Omega(k/sqrt(eps)) by Baker, Braverman, Huang, Jiang, Krauthgamer and Wu from ICML’20 and nearly matches a new upper bound of O(min(k^2/eps^2,k/eps^3)).

The talk is based on joint work with Vincent Cohen-Addad, David Saulpic and Chris Schwiegelshohn and was accepted to STOC’22 and can be found here: https://arxiv.org/abs/2202.12793

## _____________________________________________________

## Fall 2021

Date | Speaker | Talk Title |

September 13 | Marco Carmosino | LEARN-Uniform Circuit Lower Bounds and Provability in Bounded Arithmetic |

September 20 | Eylon Yogev (Bar-Ilan University) | SubquadraticSNARGsin the Random Oracle Model |

September 27 | Ari Kachmer | Covert Learning: How to Learn withan Untrusted Intermediary |

October 4 | Jie Wang (UMass Lowell) | Contextual Networks and Unsupervised Ranking of Text |

October 12 | Talya Eden | Approximating the Arboricity in Sublinear Time |

October 18 | Kyle Burke (Plymouth State University) | Undirected Geography Nimbers are PSPACE-Hard |

October 25 | Chara Podimata (Harvard University) | Contextual Search in the Presence of Irrational Agents |

November 1 | Lydia Zakynthinou (Northeastern University) | Differentially Private Covariance-Aware Mean Estimation |

November 8 | Alon Eden (Harvard University) | Private Interdependent Valuations |

November 15 | Maryam Aliakbarpour | Testing Properties of Multiple Distributions with Few Samples |

November 22 | Iden Kalemaj | Sublinear-Time Computation in the Presence of Online Erasures |

November 29 | Mahsa Derakhshan (Princeton University) | Approximation algorithms for Maximum Matching and Minimum Vertex Cover on Stochastic Graphs |

December 6 | Nathan Harms (University of Waterloo) | VC Dimension and Distribution-Free Sample-Based Testing |

December 13 | Ronitt Rubinfeld (MIT) | Locality in Computation |

### Abstracts

**Speaker:** Marco Carmosino

**Title: LEARN-Uniform Circuit Lower Bounds and Provability in Bounded Arithmetic**

**Abstract:**

We investigate randomized LEARN-uniformity, which captures the power of randomness and equivalence queries (EQ) in the construction of Boolean circuits for an explicit problem. This is an intermediate notion between P-uniformity and non-uniformity motivated by connections to learning, complexity, and logic. We establish the first unconditional lower bounds against LEARN-uniform circuits.

We employ these LEARN-uniform circuit*lower bounds*to investigate the provability of non-uniform circuit*upper bounds*in formal systems that capture “efficient” reasoning: theories of bounded arithmetic. By extracting computational information from proofs via a direct translation of “proof” into “LEARN-uniformity”, we establish robust unprovability theorems that unify, simplify, and extend nearly all previous results (Krajicek-Oliveira (2017), Muller-Bydzovsky (2020), and Bydzovsky-Krajicek-Oliveira (2020)).

Joint work with Valentine Kabanets, Antonina Kolokolova, and Igor C. Oliveira

**Speaker:** Eylon Yogev (Bar-Ilan University)

**Title: SubquadraticSNARGsin the Random Oracle Model**

**Abstract:**

In a seminal work, Micali (FOCS 1994) gave the first succinct non-interactive argument (SNARG) in the random oracle model (ROM). The construction combines a PCP and a cryptographic commitment and has many attractive features. However, it also has a significant drawback: a large argument size.

In this talk, I will describe a new construction that achieves a smaller argument size. This is the first progress on the Micali construction since it was introduced over 25 years ago.

Our construction relies on a strong soundness notion for PCPs and a weak binding notion for commitments. We hope that our work paves the way for understanding if a linear argument size is achievable in the ROM.

Joint work withAlessandro Chiesa

**Speaker:** Ari Kachmer

**Title: Covert Learning: How to Learn withan Untrusted Intermediary**

**Abstract:**

We consider the task of learning a function via oracle queries, where the queries and responses are monitored (and perhaps also modified) by an untrusted intermediary. Our goal is twofold: First, we would like to prevent the intermediary from gaining any information about either the function or the learner’s intentions (e.g. the particular hypothesis class the learner is considering). Second, we would like to curb the intermediary’s ability to meaningfully interfere with the learning process, even when it can modify the oracles’ responses.

Inspired by the works of Ishai et al. (Crypto 2019) and Goldwasser et al. (ITCS 2021), we formalize two new learning models, called Covert Learning and Covert Verifiable Learning, that capture these goals. Then, assuming hardness of the Learning Parity with Noise (LPN) problem, we show:

- Covert Learning algorithms in the agnostic setting for parity functions and decision trees, where a polynomial time eavesdropping adversary that observes all queries and responses learns nothing about either the function, or the learned hypothesis.
- Covert Verifiable Learning algorithms that provide similar learning and privacy guarantees, even in the presence of a polynomial-time adversarial intermediary that can modify all oracle responses. Here the learner is granted additional random examples and is allowed to abort whenever the oracles responses are modified.

Aside theoretical interest, our study is motivated by applications to the outsourcing of automated scientific discovery in drug design and molecular biology. It also uncovers limitations of current techniques for defending against model extraction attacks.

Joint work with Ran Canetti

**Speaker:** Jie Wang (UMass Lowell)

**Title: Contextual Networks and Unsupervised Ranking of Text**

**Abstract:**

Can machines rank sentences for a given article as accurate as humans do and do so in real time? We devise CNATAR (Contextual Network and Topic Analysis Rank) to accomplish this task. CNATAR constructs a contextual network to capture semantic and syntactic relations between texts in an article by leveraging contextual word embeddings and dependency trees. It scores nodes of the contextual network with location-biased PageRank and then scores sentences. Next, it ranks sentences by approximating a bi-objective 0-1 knapsack maximization problem based on topic analysis and sentence scores computed earlier. We show that CNATAR outperforms the combined ranking of all human judges over SummBank (a commercial dataset that provides ranking benchmarks) under both ROUGE and BLEU metrics, and substantially outperforms each judge’s individual ranking. Moreover, CNATAR meets the real-time requirement. We also carry out extensive experiments on other summarization datasets and compare CNATAR with a number of recent supervised and unsupervised summarization algorithms. I’ll present a live demo of CNATAR in connection to hierarchical reading.

Joint work with PhD students Hao Zhang (graduated, now at Apple Cupertino) and You Zhou.

**Speaker:** Talya Eden

**Title: Approximating the Arboricity in Sublinear Time**

**Abstract:**

We consider the problem of approximating the arboricity of a graph $G=(V,E)$, which we denote by $arb(G)$, in sublinear time, where the arboricity of a graph is the minimal number of forests required to cover its edges. An algorithm for this problem may perform degree and neighbor queries, and is allowed a small error probability. We design an algorithm that outputs an estimate $\hat{\alpha}$, such that with probability $1-1/poly(n)$, $arb(G)/c log2 n \leq \hat{\alpha} \leq \arb(G)$ , where $n=|V|$ and $c$ is a constant. The expected query complexity and running time of the algorithm are $O(n/arb(G))\cdot \poly(\log n)$, and this upper bound also holds with high probability. This bound is optimal for such an approximation up to a $poly(log n)$ factor. This result has important implications as many sublinear-time algorithms are parameterized by the arboricity, and rely on getting its value as input.

Based on joint work with Saleet Mossel and Dana Ron.

**Speaker:** Kyle Burke (Plymouth State University)

**Title: Undirected Geography Nimbers are PSPACE-hard**

**Abstract:**

The impartial combinatorial game Undirected Geography is known to be in P; we can quickly determine whether the first player has a winning strategy without traversing the entire game tree. Aside from winnability, there is also the notion of nimbers (a.k.a. Grundy values) which tells players how the game behaves when included as part of a (disjunctive) sum. We show that determining the nimber of Undirected Geography is PSPACE-hard; the first such impartial game where the hardness of winnability and nimbers are different.

In this talk, we will introduce Undirected Geography, explain the concept of nimbers of impartial games, and then describe the reduction showing that Undirected Geography nimbers are hard. We will then discuss the ramifications of this and play an instance of a resulting PSPACE-hard game. This talk will be geared towards a CS theory audience and will not require any background in Combinatorial Game Theory.

**Speaker:** Chara Podimata (Harvard University)

**Title: Contextual Search in the Presence of Irrational Agents**

**Abstract:**

We study contextual search, a generalization of binary search in higher dimensions, which captures settings such as feature-based dynamic pricing. Standard game-theoretic formulations of this problem assume that agents act in accordance with a specific behavioral model. In practice, however, some agents may not subscribe to the dominant behavioral model or may act in ways that seem to be arbitrarily irrational. Existing algorithms heavily depend on the behavioral model being (approximately) accurate for all agents and have poor performance in the presence of even a few such arbitrarily irrational agents. We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying behavioral model. In particular, we provide two algorithms, one based on multidimensional binary search methods and one based on gradient descent. We show that these algorithms attain near-optimal regret guarantees in the absence of irrational agents and their performance degrades gracefully with the number of such agents, providing the first results for contextual search in any adversarial noise model. Our techniques draw inspiration from learning theory, game theory, high-dimensional geometry, and convex analysis.

This paper appeared in STOC21 and is joint work with Akshay Krishnamurthy (MSR NYC), Thodoris Lykouris (MIT), and Robert Schapire (MSR NYC).

**Speaker:** Lydia Zakynthinou

**Title: Differentially Private Covariance-Aware Mean Estimation**

**Abstract:**

Covariance-aware mean estimation is a fundamental problem in statistics, where we are given n i.i.d. samples from a d-dimensional distribution with mean $\mu$ and covariance $\Sigma$ and the goal is to find an estimator $\hat\mu$ with small error $\|\hat\mu-\mu\|_{\Sigma}\leq \alpha$, where $\|\cdot\|_{\Sigma}$ denotes the Mahalanobis distance. It is known that the empirical mean of the dataset achieves this guarantee if we are given at least $n=\Omega(d/\alpha^2)$ samples. The empirical mean and other statistical estimators can reveal sensitive information about the samples of the training dataset. To protect the privacy of the individuals who participate in the dataset, we study statistical estimators which satisfy differential privacy, a condition that has become a standard criterion for individual privacy in statistics and machine learning.

In this talk, I will present two new differentially private mean estimators for d-dimensional (sub)Gaussian distributions with unknown covariance whose sample complexity is optimal up to logarithmic factors and matches the non-private one in many parameter regimes. All previous estimators with the same guarantee either require strong a priori bounds on the covariance matrix or require $\Omega(d^{3/2})$ samples. We will focus on one of the estimators, which is based on a simple idea of perturbing the empirical mean of the dataset with noise calibrated to its empirical covariance, and we will explain the challenges involved in making this estimator differentially private.

Based on the paper https://arxiv.org/pdf/2106.13329.pdf, which will appear at NeurIPS 2021 and is joint work with Gavin Brown, Marco Gaboardi, Adam Smith, and Jonathan Ullman.

**Speaker:** Alon Eden

**Title:Private Interdependent Valuations**

**Abstract:**

We consider the single-item *interdependent value setting*, where there is a single item sold by a monopolist, $n$ buyers, and each buyer has a private signal $s_i$ describing a piece of information about the item. Additionally, each bidder $i$ has a valuation function $v_i(s_1,\ldots,s_n)$ mapping the (private) signals of all buyers into a positive real number representing their value for the item. This setting captures scenarios where the item’s information is asymmetric or dispersed among agents, such as in competitions for oil drilling rights, or in auctions for art pieces. Due to the increased complexity of this model compared to the standard private values model, it is generally assumed that each bidder’s valuation function $v_i$ is \textit{public knowledge} to the seller or all other buyers. But in many situations, the seller may not know the bidders’ valuation functions—how a bidder aggregates signals into a valuation is often their private information. In this talk, I will present mechanisms that guarantee approximately-optimal social welfare while satisfying ex-post incentive compatibility and individually rationality for the case where the valuation functions are private to the bidders, and thus may be strategically misreported to the seller.

Based on a joint work with Kira Goldner (BU) and Shuran Zheng (Harvard)

**Speaker:** Maryam Aliakbarpour

**Title: Testing Properties of Multiple Distributions with Few Samples**

**Abstract:**

In this talk, we present a new setting for testing properties of distributions while receiving samples from several distributions, but few samples per distribution. The main motivation for considering this setting is that it captures data collection in the real world. We explain a brief description of our testers for the following problems in this setting when given samples from s distributions, p_1, p_2, . . . , p_s: (1) Uniformity Testing: Testing whether all the p_i’s are uniform or eps-far from being uniform in \ell_1-distance (2) Identity Testing: Testing whether all the p_i’s are equal to an explicitly given distribution q or eps-far from q in \ell_1-distance, and (3) Closeness Testing: Testing whether all the p_i’s are equal to a distribution q which we have sample access to, or eps-far from q in \ell_1-distance. By assuming an additional natural condition about the source distributions, we provide sample optimal testers for all of these problems.

Joint work with Sandeep Silwal

**Speaker:** Iden Kalemaj

**Title: Sublinear-Time Computation in the Presence of Online Erasures**

**Abstract:**

We initiate the study of sublinear-time algorithms that access their input via an online adversarial erasure oracle. After answering each query to the input object, such an oracle can erase *t* input values. Our goal is to understand the complexity of basic computational tasks in extremely adversarial situations, where the algorithm’s access to data is blocked during the execution of the algorithm in response to its actions. Specifically, we focus on property testing in the model with online erasures. We show that two fundamental properties of functions, linearity and quadraticity, can be tested for constant *t* with asymptotically the same complexity as in the standard property testing model. For linearity testing, we prove tight bounds in terms of *t*, showing that the query complexity is ϴ(log *t*). In contrast to linearity and quadraticity, some other properties, including sortedness and the Lipschitz property of sequences, cannot be tested at all, even for *t =* 1. Our investigation leads to a deeper understanding of the structure of violations of linearity and other widely studied properties.

This is joint work with Sofya Raskhodnikova (BU) and Nithin Varma (Chennai Mathematical Institute). It will appear at ITCS 2022.

**Speaker:** Mahsa Derakhshan

**Title: Approximation algorithms for Maximum Matching and Minimum Vertex Cover on Stochastic Graphs**

**Abstract:**

We consider the maximum matching problem in the following stochastic setting. Let $G$ be a given graph, $p \in (0, 1]$ a parameter of the problem, and let $G_p$ be a random subgraph that includes each edge of $G$ independently with probability $p$. We are unaware of the realization $G_p$, but can learn if an edge $e$ exists in $G_p$ by querying it. The goal is to find an approximate maximum matching of $G_p$ by querying only a few edges of the graph non-adaptively. While this problem has been studied extensively, before our work, the best-known approximation ratio for unweighted graphs was almost 2/3, and slightly better than ½ for weighted graphs. In this talk, we present algorithms that find almost optimal matchings despite the uncertainty in the graph by conducting only a constant number of queries per vertex. We will also show how this algorithm can be used for finding (1+\epsilon)-approximate minimum vertex cover on bipartite stochastic graphs.

This talk is based on three joint works with Soheil Behnezhad, Avrim Blum, and Mohammad Taghi Hajiaghayi.

**Speaker:** Nathan Harms

**Title: ****VC Dimension and Distribution-Free Sample-Based Testing**

**Abstract:**

We consider the problem of determining which classes of functions can be tested more efficiently than they can be learned, in the distribution-free sample-based model that corresponds to the standard PAC learning setting. Our main result shows that while VC dimension by itself does not always provide tight bounds on the number of samples required to test a class of functions in this model, it can be combined with a closely-related variant that we call “lower VC” (or LVC) dimension to obtain strong lower bounds on this sample complexity. We use this result to obtain strong and in many cases nearly optimal bounds on the sample complexity for testing unions of intervals, halfspaces, intersections of halfspaces, polynomial threshold functions, and decision trees.

**Speaker:** Ronitt Rubinfeld

**Title: Locality in Computation**

**Abstract:**

Consider a setting in which inputs to and outputs from a computational problem are so large, that there is not time to read them in their entirety. However, if one is only interested in small parts of the output at any given time, is it really necessary to solve the entire computational problem? Is it even necessary to view the whole input? We survey recent work in the model of “local computation algorithms” which for a given input, supports queries by a user to values of specified bits of a legal output. The goal is to design local computation algorithms in such a way that very little of the input needs to be seen in order to determine the value of any single bit of the output. Though this model describes sequential computations, techniques from local distributed algorithms have been extremely important in designing efficient local computation algorithms. In this talk, we describe results on a variety of problems for which sublinear time and space local computation algorithms have been developed — we will give special focus to finding maximal independent sets and generating random objects.

## Spring 2021

Date | Speaker | Talk Title |

January 25 | Social | |

February 1 | John Kallaugher (UT Austin) | Separations and Equivalences between Turnstile Streaming and Linear Sketching |

February 8 | Andrew McGregor (UMass) | Recent Results on Cycle Counting in the Data Stream Model |

February 16 | Ken Hoover (UCSD) | Lifting for constant-depth circuits and applications to MCSP |

February 22 | Krzysztof Onak (CDS BU) | Dynamic Algorithms for Tracking Patterns in Sparse Graphs |

March 1 | Kira Goldner (Columbia) | Mechanism Design for Social Good |

March 8 | Ohad Trabelsi (Weizmann) | New algorithms and lower bounds for all-pairs max-flow |

March 15 | Shuchi Chawla (UW–Madison) | Pandora’s Box with Correlations: Learning and Approximation |

March 22 | Andrea Lincoln (UC Berkeley) | New Techniques for Proving Fine-Grained Average-Case Hardness |

March 29 | Nicole Wein (MIT) | Approximating theDiameterof a Graph |

April 5 | Luowen Qian | Tight Quantum Time-Space Tradeoffs for Function Inversion |

April 12 | Omri Ben-Eliezer (Harvard) | Adversarially Robust Streaming Algorithms |

April 21 | Nick Spooner (BU) | Post-Quantum Succinct Arguments |

April 26 | Jonathan Huggins (CDS BU) | Statistical Inference with Stochastic Gradient Algorithms |

### Abstracts

**Speaker:** John Kallaugher

**Title: Separations and Equivalences between Turnstile Streaming and Linear Sketching**

**Abstract:**

A longstanding observation, which was partially proven in [Li,Nguyên, Woodruff `14] and [Ai, Hu, Li, Woodruff `16], is that any turnstile streaming algorithm can be implemented as a linear sketch (the reverse is trivially true). We study the relationship between turnstile streaming and linear sketching algorithms in more detail, giving both new separations and new equivalences between the two models.

It was shown in[Li,Nguyên, Woodruff `14]that, if a turnstile algorithm works for arbitrarily long streams with arbitrarily large coordinates at intermediate stages of the stream, then the turnstile algorithm is equivalent to a linear sketch. We show separations of the opposite form: if either the stream length or the maximum value of the stream are substantially restricted, there exist problems where linear sketching is exponentially harder than turnstile streaming.

A further limitation of the[Li,Nguyên, Woodruff `14]equivalence is that the turnstile sketching algorithm is neither explicit nor uniform, but requires an exponentially long advice string. We show how to remove this limitation for deterministic streaming algorithms: we give an explicit small-space algorithm that takes the streaming algorithm and computes an equivalent module.

Joint work with Eric Price, manuscript athttps://arxiv.org/abs/1905.

**Speaker:** Andrew McGregor

**Title: Recent Results on Cycle Counting in the Data Stream Model**

**Abstract:**

Estimating the number of cycles in a graph is one of the most widely studied graph problems in the data stream model. From the theoretical perspective, it is a convenient problem to explore the efficacy of various sampling techniques and to understand the limits of stream computation. From an applied perspective, the frequency of cycles and other subgraphs captures interesting structural properties of massive real-world graphs. Three relevant variants of the data stream model include: the arbitrary order model in which the stream consists of the edges of the graph in arbitrary order, the random order model in which the edges are randomly permuted, and the adjacency list order model in which all edges incident to the same vertex appear consecutively. In this talk, we survey recent work in all these models.

**Speaker:** Ken Hoover

**Title: Lifting for constant-depth circuits and applications to MCSP**

**Abstract:**

Lifting arguments show that the complexity of a function in one model is essentially that of a related function (often the composition of the original function with a small function called a gadget) in a more powerful model. Lifting has been used to prove strong lower bounds in communication complexity, proof complexity, circuit complexity and many other areas.

We present a lifting construction for constant depth unbounded fan-in circuits that given a function f, constructs a function g, so that the depth d+1 (average-case) complexity of g is controlled by the depth d (average-case) complexity of f. The function g is defined as f composed with a parity function. As an application, we show that being able to approximate the depth d (for any large constant d>3) circuit complexity of given (truth tables of) Boolean functions yields an algorithm for approximating the depth 3 circuit complexity of functions. These results are stated as reductions between gap-versions of AC0-MCSP. Our lifting results rely on a novel blockwise switching lemma that may be of independent interest.

We also show some barriers on improving the efficiency of our reductions: such improvements would either yield surprisingly efficient algorithms for MCSP (violating a new Weak Exponential Time Hypothesis for AC0-MCSP that we propose), or stronger than known AC0 circuit lower bounds.

**Speaker:** Krzysztof Onak

**Title: Dynamic Algorithms for Tracking Patterns in Sparse Graphs**

**Abstract:**

Counting graph patterns has various practical applications, including detecting fraud and anomalies in financial transactions and social networks. While the vast majority of previous work focuses on the static case and computing the global number of occurrences of a given pattern, I will show that using simple techniques, it is possible to efficiently track the number of copies of a pattern in which each vertex appears. I will demonstrate our techniques for simple graph patterns such as triangles, and cliques and cycles of size 4. As long as the graph has low arboricity, the update time of our algorithms is significantly sublinear in the number of vertices.

Joint work with Karthikeyan Shanmugam and Shay Solomon.

**Speaker:** Ohad Trabelsi

**Title: New algorithms and lower bounds for all-pairs max-flow**

**Abstract:**

When can maximum flow be solved for all pairs of nodes faster than naively solving it separately for each pair?

We will overview new algorithms – including a very recent result!, and also lower bounds (under some popular assumptions).

**Speaker:** Shuchi Chawla

**Title: Pandora’s Box with Correlations: Learning and Approximation**

**Abstract:**

The Pandora’s Box problem and its extensions capture optimization problems with stochastic input where the algorithm can obtain instantiations of input random variables at some cost. To our knowledge, all previous work on this class of problems assumes that different random variables in the input are distributed independently. As such it does not capture many real-world settings. In this work, we provide the first approximation algorithms for Pandora’s Box-type problems with correlations. We assume that the algorithm has access to samples drawn from the joint distribution on input.

Algorithms for these problems must determine an order in which to probe random variables, as well as when to stop and return the best solution found so far. In general, an optimal algorithm may make both decisions adaptively based on instantiations observed previously. Such fully adaptive (FA) strategies cannot be efficiently approximated to within any sublinear factor with sample access. We therefore focus on the simpler objective of approximating partially adaptive (PA) strategies that probe random variables in a fixed predetermined order but decide when to stop based on the instantiations observed. We consider a number of different feasibility constraints and provide simple PA strategies that are approximately optimal with respect to the best PA strategy for each case. All of our algorithms have polynomial sample complexity. We further show that our results are tight within constant factors: better factors cannot be achieved even using the full power of FA strategies.

This is joint work with Evangelia Gergatsouli, Yifeng Teng, Christos Tzamos, and Ruimin Zhang.

**Speaker:** Andrea Lincoln

**Title: New Techniques for Proving Fine-Grained Average-Case Hardness**

**Abstract:**

In this talk I will cover a new technique for worst-case to average-case reductions. There are two primary concepts introduced in this talk: “factored” problems and a framework for worst-case to average-case fine-grained (WCtoACFG) self reductions.

We will define new versions of OV, kSUM and zero-k-clique that are both worst-case and averagecase fine-grained hard assuming the core hypotheses of fine-grained complexity. We then use these as a basis for fine-grained hardness and average-case hardness of other problems. Our hard factored problems are also simple enough that we can reduce them to many other problems, e.g. to edit distance, k-LCS and versions of Max-Flow. We further consider counting variants of the factored problems and give WCtoACFG reductions for them for a natural distribution.

To show hardness for these factored problems we formalize the framework of [Boix-Adsera et al. 2019] that was used to give a WCtoACFG reduction for counting k-cliques. We define an explicit property of problems such that if a problem has that property one can use the framework on the problem to get a WCtoACFG self reduction.

In total these factored problems and the framework together give tight fine-grained average-case hardness for various problems including the counting variant of regular expression matching.

Based on joint work with Mina Dalirrooyfard and Virginia Vassilevska Williams.

**Speaker:** Nicole Wein

**Title: Approximating the Diameter of a Graph**

**Abstract:**

I will talk about algorithms and conditional lower bounds for approximating thediameterof a graph (the largest distance between two vertices). The best known algorithm for exactly computing thediameterof a graph is the naive algorithm of computing all-pairs shortest paths (APSP) and taking the largest distance. Furthermore, no significantly better algorithm exists under the Strong Exponential Time Hypothesis. Running APSP can be prohibitively slow for very large graphs, so we seek much faster algorithms that produce an approximation of thediameter. I will discuss the landscape of time vs. accuracy trade-off algorithms and hardness fordiameter.

**Speaker:** Luowen Qian

**Title: Tight Quantum Time-Space Tradeoffs for Function Inversion**

**Abstract:**

In function inversion, we are given oracle query access to a function *f*: [N] → [N], and want to prepare some advice of size S, such that we can efficiently invert any image in time T. This is a well studied problem with profound connections to cryptography, data structures, communication complexity, and circuit lower bounds. It is natural to ask whether we can combine two well-known algorithms from the last century for this problem: one being the classical Hellman’s algorithm achieving tradeoff ST = N (for permutations), and the other being Grover’s search which uses quantum time ︀ T = √︀N but no advice.

We show that ST+ T² = ̃Ω(N) is required for any quantum algorithm to invert random functions, which is best possible without implying new (classical) circuit lower bounds as shown by Corrigan-Gibbs and Kogan (2019). Prior knowledge on quantum computation is not assumed but will be helpful.

This is based on joint work with Kai-Min Chung, Siyao Guo, and Qipeng Liu, which appeared in FOCS 2020.

**Speaker:** Omri Ben-Eliezer

**Title: Adversarially Robust Streaming Algorithms **

**Abstract:**

Streaming algorithms are traditionally categorized into two classes: deterministic algorithms and randomized ones. These two types of algorithms exhibit a tradeoff between robustness and efficiency; on one hand, deterministic algorithms always return a correct output, but for many important streaming problems, they are provably inefficient. On the other hand, efficient randomized algorithms are known for a very wide range of problems, but their correctness typically relies on the assumption that the stream is fixed in advance, which is commonly unrealistic in practice.

In this talk, I will focus on a middle ground that combines both robustness and efficiency: adversarially robust algorithms, whose output is correct with high probability even when the stream updates are adaptively chosen as a function of previous outputs. This regime has received surprisingly little attention until recently, and many intriguing problems are still open. I will mention some of the recent results, discussing algorithms that are well-suited for the adversarially robust regime (random sampling), algorithms that are not robust (linear sketching), and efficient techniques to turn algorithms that work in the standard static setting into robust streaming algorithms.

The results demonstrate strong connections between streaming and other areas such as online learning, differential privacy, cryptography, adaptive data analysis and statistics.

Based on joint works with Noga Alon, Yuval Dagan, Rajesh Jayaram, Shay Moran, Moni Naor, David Woodruff, and Eylon Yogev.

**Speaker:** Nick Spooner

**Title: Post-Quantum Succinct Arguments**

**Abstract:**

A succinct argument is a proof system for NP where the total communication is much smaller than the NP witness. Almost 30 years ago, Kilian showed how to build succinct arguments with security against classical adversaries, under standard cryptographic assumptions. In this work, we show that the same construction achieves securityagainst quantum adversaries, under a standard post-quantum assumption. We achieve this by designing a quantum rewinding procedure which achieves “optimal” extraction error.

**Speaker:** Jonathan Huggins

**Title: Statistical Inference with Stochastic Gradient Algorithms**

**Abstract:**

Stochastic gradient algorithms are widely used for large-scale learning and inference problems. However, their use in practice is typically guided by heuristics and trial-and-error rather than rigorous, generalizable theory. We take a step toward better understanding the effect of tuning parameter choices by characterizing the large-sample behavior of iterates of a very general class of preconditioned stochastic gradient algorithms with fixed step size, including stochastic gradient descent with and without additional Gaussian noise, momentum, and/or acceleration. We show that near a local optimum, the iterates converge weakly to paths of an Ornstein–Uhlenbeck process, and provide sufficient conditions for the stationary distributions of the finite-sample processes to converge weakly to that of the limiting process. In particular, with appropriate choices of tuning parameters, the limiting stationary covariance can match either the Bernstein–von Mises-limit of the posterior, adjustments to the posterior for model misspecification, or the asymptotic distribution of the maximum likelihood estimate – and that with a naive tuning, the limit corresponds to none of these. Moreover, we argue that, in the large-sample regime, an essentially independent sample from the stationary distribution can be obtained after a fixed number of passes over the dataset. Our results show that properly tuned stochastic gradient algorithms offer a practical approach to obtaining inferences that are computationally efficient and statistically robust.

## Fall 2020

Date | Speaker | Talk Title |

September 14 | Marco Carmosino | Uniform and Fine-Grained Derandomization |

September 21 | Ashok Cutkosky | Low-Stress Analysis for Non-Convex Optimization |

September 28 | Alina Ene | A gentle introduction to adaptive optimization |

October 5 | Mary Wootters (Stanford) | Sharp thresholds for random subspaces, and applications to list-decoding |

October 13 | Adrian Vladu (IRIF) | Circulation Control for Faster Minimum Cost Flow in Unit-Capacity Graphs |

October 19 | Manuel Sabin (Cohubicol) | Discriminatory and Liberatory Algorithms: How Do We Define “Fair” Responsibly? |

October 26 | Eric Blais (University of Waterloo) | Forecasting and the complexity of randomized algorithms |

November 2 | Fabian Spaeh | Submodular Jaccard Distances for Clusterings |

November 9 | Nathan Dang | Voronoi Decomposition of Aperiodic Sets Closed Under Fixed-Parameter Extrapolation |

November 16 | Iden Kalemaj | Isoperimetric Inequalities for Real-Valued Functions with Applications to Monotonicity Testing |

November 23 | Satchit Sivakumar | Perfect Sampling of k-colorings |

November 30 | Nithin Varma (University of Haifa) | New Lower Bounds and Sublinear Algorithms for LIS estimation |

December 7 | Rahul Ilango(MIT) | Towards Hardness for the Minimum Circuit Size Problem |

### Abstracts

**Speaker:** Marco Carmosino

**Title: Uniform and Fine-Grained Derandomization**

**Abstract:**

Can efficient algorithms that use random coins to make choices be transformed into deterministic algorithms? If so, at what cost? Can every randomized algorithm be “de-randomized” via a generic construction, or must they be considered on a case-by-case basis? These are central questions in complexity theory.

In this talk, we construct “generic” derandomization from reasonable assumptions. We will cover recent progress, where intractability conjectures from the emerging field of “fine-grained” complexity theory are used to obtain generic and efficient de-randomizations of randomized algorithms. The intractability conjectures we use are non-cryptographic, plausible, and natural. This derandomization technique translates between a problem-centric and resource-centric view of computational complexity theory. We will compare this to previous conditional derandomizations and conclude with some open problems.

This talk is based on joint work with Russell Impagliazzo and Manuel Sabin, “Fine-Grained Derandomization: From Problem-Centric to Resource-Centric Complexity” which appeared at ICALP 2018.

**Speaker:** Ashok Cutkosky

**Title: Low-Stress Analysis for Non-Convex Optimization**

**Abstract:**

The last few years have seen many great advances in the problem of finding critical points in stochastic non-convex optimization. Under various different assumptions (e.g. second-order smoothness, two-point gradient oracles, second-order oracles), different algorithms have been proposed that improve on the convergence rate of SGD. For the most part, the analyses of these algorithms are distinct and complicated, but this need not be! In this talk, we will see how to obtain optimal algorithms in all these settings via variations on normalized SGD with momentum.

**Speaker:** Alina Ene

**Title: A gentle introduction to adaptive optimization**

**Abstract:**

In this talk, we take a brief look at some of the most popular and influential iterative algorithms for optimizing modern machine learning models. We will introduce the celebrated Adagrad algorithm (Duchi, Hazan, and Singer; McMahan and Streeter) that uses past gradient information to adaptively set its step sizes. We will briefly discuss some remarkable properties of algorithms in the Adagrad family, including their ability to automatically adapt to unknown problem structure such as (local or global) smoothness and convexity.

If time permits, we will also touch upon recent progress on obtaining accelerated Adagrad algorithms that simultaneously achieve optimal convergence in the smooth, non-smooth, and stochastic setting, without any knowledge of the smoothness parameters or the variance of the stochastic gradients. This is a very active area of research with many recent works, but we will mostly take about this recent work:https://arxiv.org/abs/2007.

**Speaker:** Mary Wootters

**Title: Sharp thresholds for random subspaces, and applications to list-decoding**

**Abstract:**

What combinatorial properties are likely to be satisfied by a random subspace over a finite field? For example, is it likely that not too many points lie in any Hamming ball? What about any cube? We show that there is a sharp threshold on the dimension of the subspace at which the answers to these questions change from “extremely likely” to “extremely unlikely,” and moreover we give a simple characterization of this threshold for different properties. Our motivation comes from error correcting codes, and we use this characterization to make progress on the questions of list-decoding and list-recovery for random linear codes, and also to establish the list-decodability of random Low Density Parity-Check (LDPC) codes.

This talk is based on the joint works with Venkatesan Guruswami, Ray Li, Jonathan Mosheiff, Nicolas Resch, Noga Ron-Zewi, and Shashwat Silas.

**Speaker:** Adrian Vladu

**Title: Circulation Control for Faster Minimum Cost Flow in Unit-Capacity Graphs**

**Abstract:**

We present an m^{4/3+o(1)} log W -time algorithm for solving the minimum cost flow problem in graphs with unit capacity, where W is the maximum absolute value of any edge weight. For sparse graphs, this improves over the best known running time for this problem and, by well-known reductions, also implies improved running times for the shortest path problem with negative weights, minimum cost bipartite b-matching when |b|_1 = O(m), and recovers the running time of the currently fastest algorithm for maximum flow in graphs with unit capacities, due to Liu and Sidford (FOCS, 2020).

Our algorithm relies on developing an interior point method–based framework which acts on the space of circulations in the underlying graph. From the combinatorial point of view, this framework can be viewed as iteratively improving the cost of a suboptimal solution by pushing flow around some carefully chosen circulations.

This talk will be self-contained and will assume no prior background in optimization. Based onhttps://arxiv.org/abs/2003.

**Speaker:** Manuel Sabin

**Title: Discriminatory and Liberatory Algorithms: How Do We Define “Fair” Responsibly?**

**Abstract:**

The nascent field of Algorithmic Fairness recognizes that algorithms have the potential to perpetuate and codify systemic discrimination and attempts the noble goal of defining notions of “Fairness” that will mitigate this. The past year, however, has seen many critiques of the field’s direction and methodologies, illustrating how the field itself is in danger of perpetuating and codifying those same systems of discrimination.

This talk will review some of these critiques and, from these, identify overarching root problems that the field seems susceptible to. After exploring three large sociotechnical consequences of the field’s current use of the word “Fairness,” we will present ongoing work for a framework that, as we will argue, allows future Algorithmic Fairness research to be less prone to these consequences. I will also briefly review methodologies and philosophies from other areas such as Sociology, Human-Computer Interaction, Community Organizing, Medicine, and others that have grappled with many of the same disciplinary issues facing Algorithmic Fairness as inspiration for setting new norms and notions of rigor for this emerging field.

While we use critiques of Algorithmic Fairness as guidance, the goal of this talk will not be to further critique the field but to find ways to address critiques and let us discuss ways to help the field grow into a responsible one.

**Speaker:** Eric Blais

**Title: Forecasting and the complexity of randomized algorithms**

**Abstract:**

Randomness is a remarkably powerful tool in the design of algorithms. By giving algorithms the ability to use random bits and letting them err with some small probability, we can solve many computational problems with remarkable efficiency. However, even the most basic questions about randomized algorithms have proven to be remarkably challenging, and many of those questions even remain open to this day.

In this talk, we will explore the notion of forecasting algorithms, a slight variant on the usual model of randomized algorithms where the algorithm must generate a confidence score along with its output. We will see how forecasting algorithms allow us to bypass some of the inherent difficulties in the analysis of bounded-error randomized algorithms, and lead to new developments on some of fundamental problems related to the composition conjecture for query complexity and to Yao’s minimax lemma.

This talk will cover material from joint work with Shalev Ben-David that can be found athttps://arxiv.org/abs/2002.108

**Speaker:** Fabian Spaeh

**Title: Submodular Jaccard Distances for Clusterings**

**Abstract:**

Different clusterings of the same dataset — a common result of the vast supply of different clustering techniques, data partitioned along a distributed database, or even performance and privacy considerations — often cause further confusion rather than offering insights into the data.

To advance in the methods and capabilities of explorative data analysis, measures for comparing clusterings have been developed along with optimization techniques aiming to unify a family of clusterings into a single representative.

In this talk, we begin with defining a novel distance measure for clustering based on a submodular version of the Jaccard distance. Other than previous measures, this allows for an interpretation of data points in a given space via a submodular parameter function.

We then formalize “combining clusterings” by posing an optimization problem asking for the most centrally located clustering between two given clustering (a so-called centroid) with respect to this new distance measure.

Our complexity analysis shows that even the standard Jaccard distance induces an NP-equivalent optimization problem and that it is impossible to approximate the general problem to more than a trivial factor of 2.

**Speaker:** Nathan Dang

**Title: Voronoi Decomposition of Aperiodic Sets Closed Under Fixed-Parameter Extrapolation**

**Abstract:**

In this talk, we begin with the properties and construction of sets Q_λ(S), λ ∈ C by fixed-parameter extrapolation. These point sets provide some intuition of how the Q_λ(S) sets relate to properties of “quasicrystals.” Namely, such sets display some ordered crystal-like properties but in the sense that they have no translational symmetry, and other central results are studied in depth by Fenner, Green, and Homer in [1].

We will also explore and visualize some first computational steps on how Q_λ(S) relates to aperiodic tilings. Namely, starting with some 2-dimensional Q_λ(S) sets, we calculate the Voronoi decomposition of Q_λ(S) using Fortune’s Algorithm. This is done by modifying an implementation of Fortune’s Algorithm to identify the distinct Voronoi cells that arise, and thereby obtain at least a lower bound on the number of tiles necessary. Ultimately, it is our hope that these computational experiments will deepen our understanding of the relationship

between fixed-point extrapolation and aperiodic tilings.

(1) S. Fenner, F. Green, and S. Homer. Fixed-parameter extrapolation and aperiodic order, 2018. arXiv:1212.2889, version 4, October 2018. https://arxiv.org/abs/1212.2889v6

**Speaker:** Iden Kalemaj

**Title: Isoperimetric Inequalities for Real-Valued Functions with Applications to Monotonicity Testing**

**Abstract:**

We study sublinear algorithms for monotonicity of real-valued functions. An algorithm for testing monotonicity queries a functionat a few locations and distinguishes between when the function is monotone and when it is far from monotone. An algorithm for approximating the distance to monotonicity returns the distance to monotonicity of the function with some additive and multiplicative error. We improve previous algorithms for these two tasksby showing that several isoperimetric inequalities about Boolean functions on the hypercube can be generalized to real-valued functions

A celebrated line of work [Margulis ’74, Talagrand ’93, Chakrabarty and Seshadhri ’13, Khot Minzer Safra ’15] studied the size of the ”boundary”between the points on which a Boolean function takes value 0 and the points on which it takes value 1. This boundary is defined in terms of the edges of the-dimensional hypercube, where the edges can be directed or undirected. In particular, the inequality of Khot, Minzer, and Safra ’15 implies all previous inequalities. It bounds the average square root of the number of decreasing edges incident on a vertex of the hypercube by the distance to monotonicity of the function. We generalize all the inequalities in this line of work to real-valued functions. Our main contribution is a Boolean decomposition technique that represents a real-valued function as a collection of Boolean functions that roughly capture the distance to monotonicity of the original function.In this talk, we prove the Boolean decompositiontheorem and touch on the improved algorithms for monotonicity testing and distance approximation.

This is joint work with Hadley Black and Sofya Raskhodnikova.

**Speaker:** Satchit Sivakumar

**Title: Perfect Sampling of k-colorings**

**Abstract:**

Markov Chain Monte Carlo is a frequently used technique to approximately obtain samples from complex target distributions. But what do you do when you want exact samples? A seminal paper of Propp and Wilson describes a method called Coupling from the Past that returns exact samples from the stationary distribution of the Markov Chain. The catch is that CFTP may not be efficiently implementable when the state space of the Markov Chain is large.

In this talk, I discuss recent progress made by Siddharth Bhandari and Sayantan Chakraborty (https://arxiv.org/abs/1909.10323) on implementing CFTP for the problem of uniformly sampling k-colorings on graphs. Their algorithm builds on the Bounding Chain method for CFTP pioneered by Huber (https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.127.3503&rep=rep1&type=pdf). Time permitting, I will also touch on a recent follow up to this work by Vishesh Jain, Ashwin Sah and Mehtaab Sawhney (https://arxiv.org/abs/2007.06360) that improves the bound on number of colors through refined Bounding Chain arguments.

**Speaker:** Nithan Varma

**Title: New Lower Bounds and Sublinear Algorithms for LIS estimation**

**Abstract:**

Estimating the length of the longest increasing subsequence (LIS) in an array is a problem of fundamental importance. Despite the significance of the LIS estimation and the amount of attention it has received, there are important aspects of the problem that are not yet fully understood. There are no better lower bounds for LIS estimation than the obvious bounds implied by testing monotonicity (for adaptive or nonadaptive algorithms). In this talk, I will present some recent results on LIS estimation, including the first nontrivial lower bound on its complexity. Additionally, I will discuss the high level ideas involved in the design of new LIS estimation algorithms that complement our lower bound and improve state of the art.

**Speaker:** Rahul Ilango

**Title: Towards Hardness for the Minimum Circuit Size Problem**

**Abstract:**

While understanding the complexity of the Minimum Circuit Size Problem (MCSP) is strongly motivated by connections to areas like learning theory, average-case complexity, and circuit complexity, we still know relatively little about the problem. In particular, it is a longstanding open problem to base the intractability of MCSP on standard worst-case assumptions.

In this talk, I will survey a series of recent works that prove hardness results for natural variants of MCSP. This includes NP-hardness for an oracle-version, a multi-output version, and a constant-depth formula version. I’ll discuss the last result in the most detail, and I’ll highlight an application that uses similar techniques to establish the existence of functions with a 2^{\Omega_d(n^1/5)} additive gap between their depth-d and depth-(d+1) formula complexity.

## Spring 2020

Date | Speaker | Talk Title |

January 27 | Yannai A. Gonczarowski (Microsoft Research) | The Menu-Size of Approximately Optimal Auctions |

February 3 | Jinshuo Dong (University of Pennsylvania) | How private are private algorithms? – A privacy CLT |

February 10 | Maryam Aliakbarpour (MIT) | Distribution testing and its new extensions |

February 24 | Marika Swanberg | A look at history independence: challenges, solutions, and applications |

March 16 | Gavin Brown | Fast learning requires good memory |

April 6 | Meghna Sengupta | Reduction of Worst-case SVP to Average-case LWE |

April 13 | Nir Weinberger (MIT) | Theoretical Guarantees on the EM Algorithm for High-Dimensional Gaussian Mixtures |

April 27 | Andrew Suh | Streaming Submodular Maximization with a Cardinality Constraint |

### Abstracts

**Speaker:**Yannai A. Gonczarowski

**Title:The Menu-Size of Approximately Optimal Auctions**

**Abstract:**We consider a monopolist who wishes to sell n goods to a single additive buyer, where the buyer’s valuations for the goods are drawn according to independent distributions. It is well known that—unlike in the single item case—the revenue-optimal auction (a pricing scheme) may be complex, sometimes requiring a continuum of menu entries, that is, offering the buyer a choice between a continuum of lottery tickets. It is also known that simple auctions with a finite bounded number of menu entries (lotteries for the buyer to choose from) can extract a constant fraction of the optimal revenue, as well as that for the case of bounded distributions it is possible to extract an arbitrarily high fraction of the optimal revenue via a finite bounded menu size. Nonetheless, the question of the possibility of extracting an arbitrarily high fraction of the optimal revenue via a finite menu size, when the valuation distributions possibly have unbounded support (or via a finite bounded menu size when the support of the distributions is bounded by an unknown bound), remained open since the seminal paper of Hart and Nisan (2013), and so has the question of any lower-bound on the menu-size that suffices for extracting an arbitrarily high fraction of the optimal revenue when selling a fixed number of goods, even for two goods and even for i.i.d. bounded distributions.

In this talk, we resolve both of these questions. We first show that for every n and for every ε>0, there exists a menu-size bound C=C(n,ε) such that auctions of menu size at most C suffice for extracting a (1-ε) fraction of the optimal revenue from any valuation distributions, and give an upper bound on C(n,ε), even when the valuation distributions are unbounded and nonidentical. We then proceed to giving two lower bounds, which hold even for bounded i.i.d. distributions: one on the dependence on n when ε=1/n and n grows large, and the other on the dependence on ε when n is fixed and ε grows small. Finally, we apply these upper and lower bounds to derive results regarding the deterministic communication complexity required to run an auction that achieves such an approximation.

Based upon:

* The Menu-Size Complexity of Revenue Approximation, Moshe Babaioff, Y. A. G., and Noam Nisan, STOC 2017

* Bounding the Menu-Size of Approximately Optimal Auctions via Optimal-Transport Duality, Y. A. G., STOC 2018

**Speaker:**Jinshuo Dong

**Title:How private are private algorithms? – A privacy CLT**

**Abstract:**It is important to understand the exact privacy guarantee provided by private algorithms developed in the past prosperous decade of differential privacy. In particular, any underestimation of actual privacy means unnecessary noise in the algorithm and loss in the final accuracy. We observe a central limit behavior in iterative private algorithms, which demonstrates the limit of the common $(\varepsilon,\delta)$ parametrization in most application scenarios. For the rescue, a new notion called Gaussian Differential Privacy (GDP) is proposed and a complete toolkit is developed. We carry out various experiments to show how much unnecessary loss of accuracy can be saved in deep learning applications. Based on joint work with Aaron Roth, Weijie Su, Zhiqi Bu and Qi Long.

**Speaker:**Maryam Aliakbarpour

**Title:Distribution testing and its new extensions**

**Abstract:**One of the most fundamental problems in learning theory is to view input data as random samples from an unknown distribution and make statistical inferences about the underlying distribution. In this talk, we focus on a notable example of such statistical task: testing properties of distributions. The goal is to design an algorithm that uses as few samples as possible from a distribution and distinguishes whether the distribution has the property, or it is $\epsilon$-far in $\ell_1$-distance from any distribution which has the property. In this talk, we explore several questions in the framework of distribution testing, such as (i) Is the distribution uniform? Or, is it far from being uniform? (ii) Is a pair of random variables independent or correlated? (iii) Is the distribution monotone? Moreover, we discuss extensions of the standard testing framework to more practical settings. For instance, we consider the case where the sensitivity of the input samples (e.g., patients’ medical records) requires the design of statistical tests that ensure the privacy of the individuals. We address this case by designing differentially private testing algorithms for several testing questions with (nearly)-optimal sample complexities.

**Speaker:**Marika Swanberg

**Title: A look at history independence: challenges, solutions, and applications**

**Abstract:**Implementations of data structures often unintentionally leak information which is not available via the standard interface–for example, the order that elements were inserted, whether an element was deleted, and more. In this talk, I will introduce history independent data structures, a formal guarantee that limits such leakage. I will then describe and analyze a specific history independent hashing scheme developed by Guy Blelloch and Daniel Golovin, including their interesting proof of history independence. Lastly, I explain some of the core challenges of implementing efficient history independent data structures and general solutions that I have observed in the literature. Time permitting, I will connect this to my larger research project of formalizing the EU data protection law called “the right to be forgotten.”

**Speaker:**Gavin Brown

**Title:Fast learning requires good memory**

**Abstract:**In this talk I will cover a line of work on time-space lower bounds for learning problems initiated by Ran Raz in 2016. We will see how parity learning on n-bit strings, a simple learning problem that can be solved by Gaussian elimination with O(n) samples and O(n^2) bits of memory, requires either \Omega(n^2) memory or exponentially many samples. We will also discuss a 2018 generalization by Garg, Raz, and Tal proving similar lower bounds for a broad class of learning problems. The proof relies on a connection between the learning problem and two-source extractors, which arise in the theory of pseudorandomness. Both papers model computation with branching programs, a general non-uniform model in which we can enforce bounded memory and associate state changes with the arrival of new samples.

“Fast learning requires good memory: a time-space lower-bound for parity learning,” Ran Raz, 2016

“Extractor-based time-space lower bounds for learning,” Sumegha Garg, Ran Raz, and Avishay Tal, 2018

**Speaker:** Meghna Sengupta

**Title:Reduction of Worst-case SVP to Average-case LWE**

**Abstract:**I will be presenting Chris Peikert’s work on the reduction from the worst-case Shortest Vector Problem (GapSVP) to the average case of thedecision version of the Learning with Errors (LWE) problem. Thereduction is interesting as it proves that the average case version ofthe LWE problem, which is used quite extensively in cryptography, isat least as hard as the worst case instance of the GapSVP Problem,which is a well-studied lattice problem that is, till now, assumed tobe hard. The reduction comprises two steps: the worst-case gapSVPproblem is reduced to the worst-case search LWE problem, which canthen be reduced to the average-case of the decisional LWE problem.

Reference – Chris Peikert, “Public-KeyCryptosystems from the Worst-Case Shortest Vector Problem” – STOC ‘09 – https://people.csail.mit.edu/cpeikert/pubs/svpcrypto.pdf

**Speaker:**Nir Weinberger

**Title:Theoretical Guarantees on the EM Algorithm for High-Dimensional Gaussian Mixtures**

**Abstract:**Despite the simplicity, widespread, and lengthy history of the expectation-maximization (EM) algorithm, theoretical guarantees on its convergence radius, required number of iterations, and statistical accuracy have only recently appeared. Most notably, [Balakrishnan et al. 2017] provided guarantees for general models, which are not sharp in general, and a sequence of papers [Xu et al. 2016, Daskalakis et al. 2017, Wu and Zhou 2019] considered the two-component Gaussian mixture model, and obtained sharp guarantees in case the two components have equal a priori weights.

In this talk, I will focus on the two-component Gaussian mixture model, but with non-equal weights (i.e., unbalanced). I will show a global convergence property for this iteration, which holds despite the lack of symmetry this model suffers from. The proof is based on an indirect argument, which quantifies the intuitive observation that the EM iteration should converge faster as the lower weight tends to zero. I will also show that the EM algorithm is adaptively optimal to the separation between the two components, both in terms of the statistical accuracy, and the required number of iterations. To the best of my knowledge, no other algorithm for this model has such favorable statistical and computational theoretical guarantees.*This is joint work with Guy Bresler.*

Time permitting, I will also discuss a related problem of designing a quantizer whose goal is to approximate a nonparametric regression function. I will describe the challenge of learning efficient quantizer from empirical data in a fully data-dependent way, and present an alternating minimization algorithm which systematically addresses this challenge.*This is joint work with Meir Feder.*

**Speaker:**Andrew Suh

**Title:****Streaming Submodular Maximization with a Cardinality Constraint**

**Abstract:**The problem of maximizing a non-negative submodular objective with a cardinality constraint is a setup that encompasses a wide class of problems from fields such as machine learning, data mining, combinatorial optimization, social networks, and many others. This problem by itself is NP-Hard, but various algorithms have been presented to provide a constant approximation. More specifically, this problem, when we impose the additional restriction of monotonicity, is relatively well-understood.

In this talk, I will present algorithms for this scenario in the streaming model where data arrives one by one in an arbitrary order and minimal space is allowed. I’ll start with a basic streaming algorithm for monotone functions (based on [Badanidiyuru et al. 2014]) in order to get a feel for the techniques that are involved. Then, I’ll present a streaming algorithm that works even for non-monotone submodular functions.

The algorithm I will present for non-monotone functions is optimal in the sense that it gives a (near) optimal approximation guarantee of 0.5 – eps when exponential post-processing computation time is allowed. When only polynomial-time post-processing is allowed, the algorithm still achieves an approximation of 0.278 – eps, beating the previously best polynomial-time approximation of 0.172 of [Feldman et al. 2018].

Joint work with Naor Alaluf, Alina Ene,Moran Feldman and Huy L. Nguyen, to appear in ICALP 2020.

## Fall 2019

Date | Speaker | Talk Title |

September 9 | Mark Bun | Private Hypothesis Selection |

September 16 | Peter Gacs | A Reliable Turing Machine |

September 23 | Michael Anastos(CMU) | Finding perfect matchings in random regular graphs in linear expected time |

September 30 | Emanuele Viola (NEU) | Some thoughts on lower bounds |

October 7 | Ryan Williams(MIT) | Towards Proving Strong Lower Bounds? The Phenomenon of Hardness Magnification |

October 15 (Tue) | Ramesh Krishnan Pallavoor | Approximating the Distance to Monotonicity of Boolean Functions |

October 21 | Talya Eden(MIT) | Sampling edges almost uniformly, parameterized by the arboricity |

October 28 | Ludmila Glinskih | Lower bounds for Read-Once Branching Programs for Tseitin formulas |

November 4 | Nathan Cordner | Continuous Facility Location Algorithms for k-Means and k-Median |

November 11 | Nadya Voronova | Testing convexity of functions over finite domains |

November 18 | Virginia Vassilevska Williams (MIT) | Limitations on All Known (and Some Unknown) Approaches to Matrix Multiplication (Distinguished Lecture) |

November 25 | Iden Kalemaj | Monotonicity testing of Boolean functions and isoperimetric type theorems |

December 2 | Alley Stoughton | Experimenting with Formal Languages Using Forlan |

December 9 | Paul Valiant(Brown) | Four Challenges in Learning |

### Abstracts

**Speaker:** Mark Bun

**Title:Private Hypothesis Selection**

**Abstract:**We investigate the problem of differentially private hypothesis selection: Given i.i.d. samples from an unknown probability distribution P and a set of m probability distributions H, the goal is to privately output a distribution from H whose total variation distance to P is comparable to that of the best such distribution.

We present several algorithms for this problem which achieve sample complexity similar to those of the best non-private algorithms. These give new and improved learning algorithms for a number of natural distribution classes. Our results also separate the sample complexities of private mean estimation under product vs. non-product distributions.

**Speaker:**Peter Gacs

**Title: A Reliable Turing Machine**

**Abstract:**We consider a computation model that is like a 1-tape Turing machine. It has an internal state (from a fixed finite set), and a head scanning a tape with cells having content from some finite alphabet. In each step the head can move at most one step, and the state and the content of the scanned cell can change. This change is dictated by a fixed transition function in most cases. But in difference from a standard Turing machine, with some small probability and independently in each time step, the change can be something else (it still must be local!).

The result is that there is a machine of this kind that, that for some positive noise bound, performs arbitrarily large computations. (We will spell out the precise input-output conventions.)

The construction and the proof are complex, similar to the hierarchical one used for reliable cellular automata, but we don’t know of any easy reduction from one result to the other.

**Speaker:**Michael Anastos

**Title:Finding perfect matchings in random regular graphs in linear expected time**

**Abstract:**In a seminal paper on finding large matchings in sparse random graphs, Karp and Sipser proposed two algorithms for this task. The second algorithm has been intensely studied, but due to technical difficulties, the first algorithm has received less attention. Empirical results suggest that the first algorithm is superior. We show that this is indeed the case, at least for random regular graphs. We show that w.h.p. the first algorithm will find a matching of size n/2 − O(log n) on a random r-regular graph (r = O(1)). We also show that the algorithm can be adapted to find a perfect matching w.h.p. in O(n) time, as opposed to O(n^(3/2)) time for the worst-case. This is based on joint work with Alan Frieze

**Speaker:**Emanuele Viola (NEU)

**Title:Some thoughts on lower bounds**

**Abstract:**We discuss computational lower bounds in a variety of models including circuits, data structures, and Turing machines. Free of diagonalization, the talk is in part historical and speculative. But we also cover recent results in each of the models just listed.

**Speaker:**Ryan Williams

**Title:****Towards Proving Strong Lower Bounds? The Phenomenon of Hardness Magnification**

**Abstract:**Starting with Oliveira and Santhanam [FOCS 2018], a recent thread of work has shown how very minor-looking lower bounds for certain problems on certain models (where we already know strong lower bounds) would imply major separations of complexity classes. In this talk, I will give an overview of work along these lines that I’ve been a part of, namely Murray-McKay-Williams (STOC 2019) and Chen-Jin-Williams (FOCS 2019). At a very high level, these papers show how proving weak-looking circuit lower bounds for *sparse* languages (or “compression” problems) would imply super-polynomial lower bounds for other problems.

**Speaker:** Ramesh Krishnan Pallavoor

**Title:Approximating the Distance to Monotonicity of Boolean Functions**

**Abstract:**We study the problem of approximating the distance to monotonicity of Boolean functions over the hypercube domain. For a function f:{0,1}^n -> {0,1}, the distance to monotonicity of f is the minimum fraction of points at which f has to be modified to make it monotone. We give a nonadaptive algorithm that, given f:{0,1}^n -> {0,1} whose distance to monotonicity is at least \alpha, makes poly(n, 1/\alpha) queries and returns an O(\sqrt{n} polylog n)-multiplicative approximation to the distance to monotonicity of f. Furthermore, we show that for any constant k > 0, approximating the distance to monotonicity up to n^{0.5 – k}-factor requires 2^{n^k} nonadaptive queries, thereby ruling out a poly(n, 1/\alpha)-query nonadaptive algorithm for such approximations. This answers a question of Seshadhri (Property Testing Review ’14) for the case of nonadaptive algorithms. Approximating the distance to a property is equivalent to tolerantly testing that property. Our lower bound stands in contrast to standard (non-tolerant) testing of monotonicity that can be done nonadaptively with O(\sqrt{n} polylog n) queries.

We obtain our lower bound by proving an analogous bound for erasure-resilient testers. Our method yields the same lower bounds for other properties (unateness and being a k-junta). These lower bounds improve exponentially on the existing lower bounds for these properties.

Joint work with Sofya Raskhodnikova and Erik Waingarten, to appear in SODA 2020.

**Speaker:** Talya Eden

**Title:Sampling edges almost uniformly, parameterized by the arboricity**

**Abstract:**In this talk, I will discuss the problem of sampling edges in an unknown graph G = (V, E) from a distribution that is pointwise almost uniform over E. Given query access to a graph G over n vertices, m edges and abound \alpha on the arboricity of G, I will describe an algorithm that performs O\left(\frac{n\alpha}{m}\cdot \frac{\log^3 n}{\varepsilon}\right) queries in expectations and returns an edge in the graph such that every edge e \in E is sampled with probability (1\pm\varepsilon)/m. We also show that the upper bound is tight (up to poly-logarithmic factors and the dependence in \varepsilon). If time will permit I will also discuss a simple algorithm for estimating the number of edges in graphs with bounded arboricity.

This talk is based on joint work with Dana Ron and Will Rosenbaum.

**Speaker:** Ludmila Glinskih

**Title: Lower bounds for Read-Once Branching Programs for Tseitin formulas**

**Abstract:** In this talk, I will describe some lower bounds on Read-Once Branching Program. A branching program is a way of representing a boolean function as an acyclic directed graph with one source and two sinks, where sinks are labeled with 0 and 1, every other vertex is labeled with a variable and every edge is labeled with a 0-1-value. Read-once branching program is a restricted version of branching program in which on every path every variable occurs no more than once.

I will show that any nondeterministic read-once branching program that computes a satisfiable Tseitin formula based on an (n,d,\alpha)-expander with \alpha < 1/3d has size at least 2^{\Omega(n)}. I will also mention lower bounds for Tseitin formulas defined on complete graphs and graphs with large treewidth as well as upper bounds.

This talk is based on a joint work with Dmitry Itsykson.

**Speaker:**Nathan Cordner

**Title:Continuous Facility Location Algorithms for k-Means and k-Median**

**Abstract:**k-means and k-median clustering are classic NP-hard problems that have many applications in computer science. Both problems can be expressed as instances of an uncapacitated facility location (UFL) problem, which seeks to open a minimum cost number of facilities to service a set of clients. In 2001, Jain and Vazirani introduced an algorithm for UFL that guarantees a 3-approximation for k-median and a 9-approximation for k-means—provided that the algorithm is able to open exactly k facilities. In this talk, I will summarize the work of Archer et al. (2003) and Ahmadian et al. (2017) to create a continuous Jain and Vazirani algorithm that can find solutions for all possible values of k.

A summary paper is available at http://cs-people.bu.edu/ncordner/papers/Continuous_JV_Summary.pdf

**Speaker:**Nadya Voronova

**Title:Testing convexity of functions over finite domains**

**Abstract:**In this talk, I will introduce results in testing convexity of the functions over different finite domains(line, hyper-grid, [3]x[n]). We’ll give upper bounds for testing convexity over the line and over [3]x[n] domain. We’ll also discuss lower bounds for testing convexity over the line, over [3]x[n] domain and over [n]^d hyper-grid.

Based on “Testing convexity of functions over finite domains” by Aleksandrs Belovs, Eric Blais, Abhinav Bommireddi.

**Speaker:**Virginia Vassilevska Williams

**Title:Limitations on All Known (and Some Unknown) Approaches to Matrix Multiplication**

**Abstract:**In this talk we will consider the known techniques for designing Matrix Multiplication algorithms. The two main approaches are the Laser method of Strassen, and the Group theoretic approach of Cohn and Umans. We define generalizations that subsume these two approaches: the Galactic and the Universal method; the latter is the most general method there is. We then design a suite of techniques for proving lower bounds on the value of $\omega$, the exponent of matrix multiplication, which can be achieved by algorithms using many tensors $T$ and the Galactic method. Some of our techniques exploit `local’ properties of $T$, like finding a sub-tensor of $T$ which is so `weak’ that $T$ itself couldn’t be used to achieve a good bound on $\omega$, while others exploit `global’ properties, like $T$ being a monomial degeneration of the structural tensor of a group algebra.

The main result is that there is a universal constant $\ell>2$ such that a large class of tensors generalizing the Coppersmith-Winograd tensor $CW_q$ cannot be used within the Galactic method to show a bound on $\omega$ better than $\ell$, for any $q$. We give evidence that previous lower-bounding techniques were not strong enough to show this.

The talk is based on joint work with Josh Alman, which appeared in FOCS 2018, and on Josh Alman’s follow-up paper in CCC’19 in which he shows that the Coppersmith-Winograd tensor cannot yield a better bound on $\omega$ than 2.16805 even using the Universal method. We will not assume any familiarity with the work on matrix multiplication and will strive to give an overview while presenting our results.

**Speaker:** Iden Kalemaj

**Title:Monotonicity testing of Boolean functions and isoperimetric type theorems**

**Abstract:**In this talk we present results in testing monotonicity of Boolean-valued functions over the hypercube. A Boolean function f: {0,1}^n -> {0,1} is said to be epsilon-far from monotone if f needs to be modified in at least epsilon-fraction of points to make it monotone. A randomized tester is given the parameter epsilon and oracle access to f. We consider testers that query f non-adaptively on a small fraction of its domain points, always accept if f is monotone, and reject with a high probability if f is epsilon-far from monotone. Dodis, Goldreich, Lehman, Raskhodnikova, Ron, Samorodnitsky ’99 and Goldreich, Goldwasser, Lehman, Ron, Samorodnitsky ’00 introduced a monotonicity tester that queries f: {0,1}^n -> {0,1} in O(n/epsilon) points. It was a long-standing question whether monotonicity of Boolean f could be tested in o(n)queries. Although not the first to answer this question in the positive, Khot, Minzer, and Safra ’15 introduced a tester that has optimal query complexity in terms of n, making \tilde{O}(\sqrt(n) / epsilon^2) queries. We describe the O(n/epsilon) tester and its analysis in detail. For the \sqrt(n)-tester, we focus on the key ingredient of the analysis: an isoperimetric inequality correlating the average squared-root fraction of “violations” to monotonicity at each domain point, and the distance of f to monotonicity.

**Speaker:**Alley Stoughton

**Title:Experimenting with Formal Languages Using Forlan**

**Abstract:**I will give an introduction to the Forlan formal language (automata) theory toolset, which was designed to facilitate sophisticated experimentation with formal languages (sets of strings over an alphabet (finite set of symbols)). Forlan is implemented in the functional programming language Standard ML (SML), a language whose notation and concepts are similar to those of mathematics. Forlan is used interactively, and users are able to extend Forlan by defining SML functions. I’ll present an extended example of the kind of experimentation that Forlan makes possible. It involves the design of finite state automata using closure properties/algorithms for regular languages/finite automata.

**Speaker:** Paul Valiant

**Title:Four Challenges in Learning**

**Abstract:**In this talk I will motivate and discuss solutions to four learning challenges: How to efficiently optimize beyond the class of convex functions; How to get good performance in deep learning despite overfitting training data; How to efficiently count rare items in data; and, How to make predictions when the future is not like the past.

The first result shows how to extend polynomial-time optimization algorithms from the classic case of convex optimization to a broader class of “star-convex” functions, which capture some natural non-convex settings in learning. The algorithm is a randomized adaptation of the ellipsoid algorithm which, unexpectedly, uses information from far outside the feasible region. The second result confronts one of the main challenges in deep learning: understanding why “overparameterized” models that can perfectly fit the training data still have such good performance on new data. We introduce a new notion of implicit regularization that arises by comparing stochastic gradient descent to an Ornstein-Uhlenbeck process. The third result considers the problem of rapidly estimating the frequency of a rare and expensive-to-verify occurrence in a database, such as a data error that might require human judgement to detect. We introduce an adaptive algorithm that abandons further investigation of items that look potentially not-rare, without introducing bias in the estimate. Finally, we consider a fundamental and perennial challenge for machine learning: as opposed to assuming that the test set and training set were drawn from the same distribution, suppose we instead know that some different probabilistic process took place to produce the training data and test data. We introduce a new framework based on a semidefinite-programming relaxation to yield almost-optimal linear estimators in the case of mean estimation in the challenging setting where an adversary chooses data values.

## Fall 2018

Date | Speaker | Talk Title |

September 10 | Shreyas Sekar (Harvard) | Optimal Information Design for Selfish Routing – Exploiting Uncertainty and Heterogeneity for Social Good |

September 24 | Jon Ullman (Northeastern) | Privately Learning High-Dimensional Distributions |

October 1 | Nathan Cordner (BU) | Primal-Dual Algorithms for Clustering and Feature Allocation |

October 9 | Suvrit Sra (MIT) | A critical view of optimality in deep learning |

October 15 | Nithin Varma (BU) | Separating erasures and errors in property testing using local list decoding |

October 22 | Jelani Nelson (Harvard) | BPTree: an l_2 heavy hitters algorithm using constant memory |

October 29 | Erasmo Tani (BU) | From Discrete to Continuous (and back) through submodular functions |

November 5 | Constantinos Daskalakis (MIT) | |

November 12 | Barna Saha (UMass Amherst) | Distinguished lecture. |

November 19 | Seth Neel (Penn) | |

November 26 | Michael Newman (Duke) | |

December 3 | Adrian Vladu (BU) | |

December 10 | Raef Bassily (OSU) |

### Abstracts

**Speaker:** Erasmo Tani

**Title:** **From Discrete to Continuous (and back) through submodular functions.**

**Abstract:** Several recent results in theoretical computer science have emerged from the synergy between continuous and discrete ideas. In this talk, we will discuss how functions with diminishing returns properties provide an interface between these two worlds. We will investigate how submodular set functions generalize to continuous domains, and how algorithmic techniques are adapted to deal with these changes.

**Bio:** Erasmo joined the Department of Computer Science at BU last year, and is working with Lorenzo Orecchia and Alina Ene on optimization algorithms. Prior to joining BU, he got his Masters degree in Computer Science and Mathematics from the University of Bristol in the UK.

**Speaker:** Jelani Nelson

**Title:**BPTree**: an l_2 heavy hitters algorithm using constant memory**

**Abstract:** In the ‘frequent items’ problem one sees a sequence of items in a

stream (e.g. a stream of words coming into a search query engine like

Google) and wants to report a small list of items containing all

frequent items. We would like algorithms for this problem that use

memory substantially sublinear in the length of the stream.

In this talk, we describe a new state-of-the-art solution to this

problem, called the “BPTree”. We make use of chaining methods to

control the suprema of Rademacher processes to develop this new

algorithm which has provably near-optimal memory consumption for the

“l_2” heavy hitters problem, improving upon the CountSieve and

CountSketch from previous work.

Based on joint work with Vladimir Braverman, Stephen Chestnut, Nikita

Ivkin, Zhengyu Wang, and David P. Woodruff.

**Bio:** Jelani Nelson is Associate Professor of Computer Science and John L.

Loeb Associate Professor of Engineering and Applied Sciences at

Harvard. His main research interest is in algorithm design and

analysis, with a focus on streaming algorithms, dimensionality

reduction, compressed sensing, and randomized linear algebra

algorithms. He completed his Ph.D. in computer science at MIT in 2011,

receiving the George M. Sprowls Award for best computer science

doctoral dissertations at MIT. He is the recipient of an NSF CAREER

Award, ONR Young Investigator Award, ONR Director of Research Early

Career Award, Alfred P. Sloan Research Fellowship, and Presidential

Early Career Award for Scientists and Engineers (PECASE).

**Speaker:** Nithin Varma

**Title: Separating erasures and errors in property testing using local list decoding**

Abstract: Corruption in data can be in the form of erasures (missing data) or errors (wrong data). Erasure-resilient property testing (Dixit, Raskhodnikova, Thakurta, Varma ’16) and tolerant property testing (Parnas, Ron, Rubinfeld ’06) are two formal models of sublinear algorithms that account for the presence of erasures and errors in input data, respectively.

We first show that there exists a property P that has an erasure-resilient tester whose query complexity is independent of the input size n, whereas every tolerant tester for P has query complexity that depends on n. We obtain this result by designing a local list decoder for the Hadamard code that works in the presence of erasures, thereby proving an analog of the famous Goldreich-Levin Theorem. We also show a strengthened separation by proving that there exists another property R such that R has a constant-query erasure-resilient tester, whereas every tolerant tester for R requires n^{Omega(1)} queries. The main tool used in proving the strengthened separation is an approximate variant of a locally list decodable code that works against erasures.

Joint work with Sofya Raskhodnikova and Noga Ron-Zewi.

**Short Bio:** Nithin Varma is a 5th year Ph. D. student working with Dr. Sofya Raskhodnikova at Boston University. He is broadly interested in theoretical computer science, especially in algorithms. His research at BU has focussed on formalizing and understanding fault-resilient computational models for large data algorithms.

**Speaker:** Suvrit Sra

**Title: A critical view of optimality in deep learning**

**Abstract:**In this talk, we will look at the loss surface of neural networks. We prove that even for one- hidden-layer networks with “slightest” nonlinearity, the empirical risks have spurious local minima in most cases. Our results thus indicate that in general “no spurious local minima” is a property limited to deep linear networks, and insights obtained from linear networks are not robust. Specifically, for ReLU(-like) networks we constructively prove that for almost all (in contrast to previous results) practical datasets there exist infinitely many local minima. Time permitting, I will also touch upon a comprehensive characterization of global optimality for deep linear networks, which unifies other resultson this topic.

**Bio:** Suvrit Sra is a faculty member within the EECS department at MIT, where he is also a core faculty member of IDSS, and a core member of LIDS, MIT-ML, and the statistics and data science center. His research spans topics in optimization, matrix theory, differential geometry, and probability theory, which he connects with machine learning — and a key focus of his research is on the theme “Optimization for Machine Learning”.

**Speaker:** Nathan Cordner

**Title: Primal-Dual Algorithms for Clustering and Feature Allocation**

**Abstract**:Finding clusters in a data set is an important problem with many applications, especially in machine learning and data mining. Popular clustering algorithms such as K-means can be very quick in partitioning a data set, but are not without their drawbacks. One problem is that K-means requires prior knowledge of the number of clusters in the data set. Additionally, adaptations of K-means to solve the related feature allocation problem can be quite slow.

In a 2001 paper, Jain and Vazirani introduced a primal-dual algorithm to solve the metric uncapacitated facility location problem. In this talk I will discuss their algorithm, and how to adapt it to find clusters without needing to set a parameter. I will also discuss ongoing research for using the primal-dual approach to efficiently solve the feature allocation problem.

**Bio:**Nathan Cordner is a second-year computer science PhD student at Boston University, with general research interests in algorithms and optimization. Prior to coming to BU, he completed bachelor’s and master’s degrees in mathematics at Brigham Young University in Provo, Utah and worked for a year as a software engineer at Vecna Technologies in Cambridge, Massachusetts. Outside of academic work, Nathan enjoys playing the piano and the french horn, reading his wife’s theology books, and exploring the historic sites around Boston.

**Speaker:** Jon Ullman

**Title:Privately Learning High-Dimensional Distributions**

**Abstract:** We design nearly novel, computationally efficient algorithms for learning two fundamental families of high-dimensional distributions in total variation distance: multivariate Gaussians and product distributions over the Boolean hypercube. The sample complexity of both our algorithms approaches the sample complexity of non-private learners up to a small multiplicative factor for a wide range of parameters. Thus, our results show that private comes essentially for free for these problems, providing a counterpoint to the many negative results showing that privacy is often costly in high dimensions. Our algorithms use a novel technical approach to reducing the sensitivity of the estimation procedure that we call recursive private preconditioning and may find additional applications.

Joint work with Gautam Kamath, Jerry Li, and Vikrant Singhal.

**Bio:** Jon Ullman is an assistant professor in the College of Computer and Information Science at Northeastern University. His research in theoretical computer science sits at the intersection of privacy, machine learning, and cryptography. He has been recognized with an NSF CAREER Award and a Google Faculty Research Award.

**Speaker:** Shreyas Sekar

**Title:****Optimal Information Design for Selfish Routing – Exploiting Uncertainty and Heterogeneity for Social Good**

**Abstract:**Motivated by the increasingly proactive role played by traffic navigation apps (e.g., Google Maps, Waze) in urban transportation, we consider the problem of disseminating information to minimize network congestion. Formally, we study anetworked routing game where infinitesimal users route their flow in the presence of uncertainty regarding the edge costs. Given this setup, we focus on the “optimal information design” problem, where a planner disseminates(partial)information to the users on the network costs. The planner’s objective is to minimize the overall social cost at the equilibrium solution that is formed when users select routes based on the provided information.

We show how to design optimal information schemes, both deterministic and randomized, under a number of reasonable constraints. Specifically, we show the following results:(i)providing conservative cost estimates to a small fraction of users always leads to an improvement in social cost,(ii)using simple information schemes can lower the price of anarchy from 4/3 to 8/7 when the costs are linear. Overall, our results reveal surprising trends on how policies that exploit the diversity and uncertainty in users’ beliefs outperform those that always present users with full information.

Joint work with Liyuan Zheng, Lillian Ratliff, and Baosen Zhang.

**Bio:**Shreyas Sekar is a Postdoctoral Fellow at the Laboratory for Innovation Science at Harvard (affiliated with SEAS and HBS). Shreyas received his Ph.D. in Computer Science from Rensselaer Polytechnic Institute in 2017, where he was awarded the Robert McNaughton Prize for the best outgoing graduate student. His dissertation was on Non-Discriminative Algorithmic Pricing. From 2017-2018, he was a Postdoctoral Researcher at the University of Washington, Seattle.Broadly speaking, his academic interests lie at the intersection of theoretical computer science, market design, social choice, and recently, online learning. A large chunk of his research centers around problems arising in computational economics and its impact on society. Currently, Shreyas is excited by questions pertaining to information design for multi-agent systems — (a) can we strategically present information to self-interested users to maximize social welfare without compromising on fairness?, and (b) is it possible to leverage uncertainty for social good?