**STOC 2020 Workshop: Recent Advances in Discrepancy and Applications**

**Organizers:** Nikhil Bansal (bansal@gmail.com)

Aleksandar Nikolov (anikolov@cs.toronto.edu)

**Description**

Discrepancy theory is a branch of combinatorics
with several applications to other areas of pure and applied mathematics, and
to TCS. Classically, discrepancy theory has been used to construct uniformly
distributed point sets, for example epsilon-nets in computational geometry. In
recent years there has been a lot of progress on several aspects of
discrepancy, resulting in new algorithmic techniques and various connections to
convex geometry and probability. There have also been surprising applications
to topics such as rounding in approximation algorithms, reconstruction attacks
in differential privacy, coreset design for kernel
density estimation in machine learning, randomized experiment design and so on.
The aim of the workshop is to give an overview of these developments, with a focus on recent results and open directions.

Besides introducing non-experts to the area, one of the main goals of the workshop is to present the latest algorithmic developments in discrepancy theory, highlight open questions, and point out connections to other areas, hopefully leading to interesting new research.

Date: Tuesday, June 23, 2020

Time: noon - 3pm CDT (1-4pm EDT, 10am-1pm PDT)

Talks will take place online via Zoom.

12:00 - 12:45 Nikhil Bansal (slides)

12:45 - 13:30 Aleksandar Nikolov (slides)

13:30 - 14:00 Zohar Karnin (slides)

14:00 - 14:30 Peng Zhang (slides)

14:30 - 15:00 Aditya Potukuchi

**Titles and Abstracts:**

**Nikhil Bansal (CWI)**

**Title: Algorithmic discrepancy, geometry and applications**

Abstract: We will
start with the basic concepts in discrepancy theory and describe some classic
applications in areas such as approximation and sparsification.
We then consider discrepancy from a
convex geometric and vector-balancing point of view. This view has been
extremely useful in the algorithmic progress on discrepancy in recent years,
often matching or even improving upon previously known non-constructive bounds.
I will give an overview of these recent results and the key algorithmic ideas behind them.
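
For readers new to the area, the basic object is easy to state (a minimal formal sketch using standard notation, not fixed in the abstract): for a set system S_1, ..., S_m over the ground set {1, ..., n}, the discrepancy is

```latex
\[
  \operatorname{disc}(\mathcal{S})
    \;=\; \min_{x \in \{-1,+1\}^n} \; \max_{j \in [m]} \Bigl|\sum_{i \in S_j} x_i\Bigr|.
\]
```

Equivalently, writing v_1, ..., v_n for the columns of the incidence matrix of the system, one seeks signs x_i making the infinity norm of sum_i x_i v_i small; this is the vector-balancing view mentioned above. Spencer's classical theorem, for example, gives discrepancy at most 6 sqrt{n} when m = n.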

**Aleksandar Nikolov (Toronto)**

**Title: Hereditary Discrepancy and Factorization Norms**

Abstract: Hereditary discrepancy is a robust
version of combinatorial discrepancy, equal to the maximum discrepancy of any
restriction of the original set system. Hereditary discrepancy is stable under
simple operations: for example, it behaves nicely under taking restrictions,
unions or duals of the set system. We will discuss a connection between
hereditary discrepancy and factorization norms of matrices, which implies these stability properties, yields an efficient approximation algorithm for hereditary discrepancy, and provides a powerful tool for proving upper and lower bounds on
discrepancy. We will also sketch applications of hereditary discrepancy to
rounding algorithms, to construction of uniformly distributed point sets, and
to differential privacy.
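
To make the connection concrete (standard definitions, sketched here for reference): for an m x n matrix A,

```latex
\[
  \operatorname{herdisc}(A) \;=\; \max_{J \subseteq [n]} \operatorname{disc}(A_J),
  \qquad
  \gamma_2(A) \;=\; \min_{A = BC}\,
     \bigl(\max_i \lVert B_{i,\cdot} \rVert_2\bigr)
     \bigl(\max_j \lVert C_{\cdot,j} \rVert_2\bigr),
\]
```

where A_J is the submatrix of columns indexed by J and the minimum ranges over all factorizations A = BC. Up to factors polylogarithmic in the number of rows m, the factorization norm gamma_2(A) bounds herdisc(A) from both above and below, which is what yields the efficient approximation algorithm mentioned above.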

**Zohar Karnin (Amazon)**

**Title: Discrepancy, Coresets, and Sketches in Machine Learning**

Abstract: In this talk we discuss applications of discrepancy to coresets, machine learning, and streaming algorithms. Discrepancy theory provides methods to partition a set of vectors, or a class of functions, in a highly balanced manner, surpassing the guarantees of a random partition. A coreset
is defined in the context of an optimization task over a set of points and
consists of a subset that represents the whole, typically better than a random
subset. We review the connection between low discrepancy partitions and coresets, and how an efficient solution to one can be
applied to the other. We show how this connection is also helpful in the
context of streaming algorithms, and in machine learning where discrepancy
methods can be used to provide meaningful generalization bounds for coresets.
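
To illustrate the coreset connection, here is a minimal Python sketch of the classical halving idea: color the weighted points with signs of low discrepancy, keep one color class with doubled weights, and repeat. The greedy coloring below is our own simple stand-in for a low-discrepancy algorithm, not a method from the talk.

```python
import numpy as np

def greedy_signs(F):
    """Assign each row of F a sign in {-1,+1}, greedily keeping the
    running signed sum short (a simple low-discrepancy heuristic)."""
    signs = np.empty(len(F))
    running = np.zeros(F.shape[1])
    for i, f in enumerate(F):
        # pick the sign whose cross term with the running sum is non-positive,
        # so the squared norm grows by at most ||f||^2 per step
        signs[i] = 1.0 if np.dot(running, f) <= 0 else -1.0
        running += signs[i] * f
    return signs

def halve(X, w):
    """One halving step: balance the weighted features together with a
    constant column (so total weight is nearly preserved too), then
    keep the '+' side with doubled weights."""
    F = np.hstack([np.ones((len(X), 1)), X]) * w[:, None]
    keep = greedy_signs(F) > 0
    return X[keep], 2.0 * w[keep]

# Toy usage: compress 1000 points to ~250 while nearly preserving the
# weighted mean of the features.
rng = np.random.default_rng(0)
X, w = rng.normal(size=(1000, 5)), np.ones(1000)
target = X.mean(axis=0)
for _ in range(2):
    X, w = halve(X, w)
print(len(X), np.linalg.norm((w[:, None] * X).sum(axis=0) / w.sum() - target))
```

In streaming settings the same halving step is applied within a merge-and-reduce tree, which is the connection to sketching mentioned above; the discrepancy-based colorings from the talk give much stronger balance guarantees than this greedy heuristic.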

**Peng Zhang (Yale)**

**Title: Balancing covariates in randomized experiments using the Gram–Schmidt walk**

Abstract: In randomized experiments, such as medical trials, we randomly assign the treatment, such as a drug or a placebo,
that each experimental subject receives. Randomization can help us accurately
estimate the difference in treatment effects with high probability. We also
know that we want the two groups to be similar: ideally the two groups would be
similar in every statistic we can measure beforehand. Recent advances in
algorithmic discrepancy theory allow us to divide subjects into groups with similar
statistics. By exploiting the recent Gram-Schmidt Walk algorithm of Bansal, Dadush, Garg, and Lovett, we can obtain random assignments
of low discrepancy. These allow us to obtain more accurate estimates of
treatment effects when the information we measure about the subjects is
predictive, while also bounding the worst-case behavior when it is not.
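
In symbols (a hedged sketch of the setup from the paper linked below): an assignment is a vector z in {-1,+1}^n, with subject i treated when z_i = +1, and the treatment effect is estimated by the Horvitz-Thompson estimator

```latex
\[
  \hat{\tau} \;=\; \frac{2}{n} \sum_{i=1}^{n} z_i \, y_i ,
\]
```

where y_i is the observed outcome of subject i. Any design in which each z_i is +1 or -1 with probability 1/2 keeps this estimator unbiased; the Gram-Schmidt walk design additionally makes the covariate imbalance sum_i z_i x_i small with high probability, where x_i is the covariate vector of subject i, which is what drives the improved accuracy described above.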

We will explain the experimental design
problem we address, how we use and analyze the Gram-Schmidt walk algorithm, and
remaining open problems. This is joint work with Chris Harshaw,
Fredrik Sävje and Daniel Spielman.

Paper: https://arxiv.org/abs/1911.03071

Code: https://github.com/crharshaw/GSWDesign.jl

**Aditya Potukuchi (Rutgers)**

**Title: Hypergraph discrepancy in random and pseudorandom settings**

Abstract: The Beck-Fiala
conjecture states that for a t-regular hypergraph (every vertex is in t hyperedges), the discrepancy is at most O(sqrt{t}). The seeming difficulty in making any progress
towards resolving this has led to the study of the discrepancy of random
t-regular hypergraphs with n vertices and m hyperedges.
Here, Ezra and Lovett (RANDOM 2015) showed that the discrepancy is at most O(sqrt{t log t}) almost surely as t grows and n = O(m). Later, a result of Bansal and Meka (SODA 2019) shows that the discrepancy is O(sqrt{t}) almost surely as n grows and t = Omega(log^2 log m).
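
For reference, the objects in question (standard definitions, not spelled out in the abstract): for a hypergraph H on vertex set V in which every vertex lies in at most t hyperedges,

```latex
\[
  \operatorname{disc}(\mathcal{H})
    \;=\; \min_{x \in \{-1,+1\}^{V}} \; \max_{e \in \mathcal{H}} \Bigl|\sum_{v \in e} x_v\Bigr|
    \;\le\; 2t - 1,
\]
```

the bound 2t - 1 being the classical Beck-Fiala theorem; the conjecture asserts that it can be improved to O(sqrt{t}).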

We will see a proof that for every t > 0, the
discrepancy of a random t-regular hypergraph is almost surely O(sqrt{t}) as n grows and n = O(m). The proof is algorithmic
and works more generally because it is based on a spectral "pseudorandomness" condition on the incidence matrix of the hypergraph, which holds with high probability.

If time permits, we will also discuss the discrepancy of random hypergraphs where every point-edge incidence is an i.i.d. Bernoulli random variable.