March 17

December 09

October 28

October 14

4:15-5:15 PM (America/New_York)

Introducing Algorithmic Thinking Theory for Foundation Models

The last few months have witnessed tremendous advances in Large Language Model (LLM) reasoning capabilities, with Gemini and GPT winning gold medals at the International Mathematical Olympiad (IMO) [1] and the International Collegiate Programming Contest (ICPC) [2]. Several papers have shown that inference scaling techniques significantly improve the reasoning performance of LLMs, in particular on the IMO [3]. We will discuss these results, show how one can frame the problem as an optimization problem, relate it to the empirical results in [3], and derive optimal (algorithmic) thinking strategies. We will also discuss avenues for refining the model and improving inference scaling methods.

[1] https://deepmind.google/discover/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/
[2] https://deepmind.google/discover/blog/gemini-achieves-gold-level-performance-at-the-international-collegiate-programming-contest-world-finals/
[3] https://arxiv.org/abs/2507.15855
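As a loose illustration of the kind of optimization problem hinted at in this abstract, the toy sketch below treats inference scaling as allocating a fixed token budget across independent attempts. It is not the speaker's formulation: the per-attempt success rate, the cost model, and the best-of-n strategy are all assumptions made here for illustration.

```python
def best_of_n_success(p: float, n: int) -> float:
    """Probability that at least one of n independent attempts succeeds,
    assuming each attempt solves the problem with probability p."""
    return 1.0 - (1.0 - p) ** n


def attempts_within_budget(token_budget: int, tokens_per_attempt: int) -> int:
    """Spend the whole budget on parallel attempts; a stand-in for the
    budget-allocation question an optimal thinking strategy would answer."""
    return max(1, token_budget // tokens_per_attempt)


if __name__ == "__main__":
    p = 0.15  # hypothetical per-attempt success rate on a hard problem
    for budget in (10_000, 40_000, 160_000):
        n = attempts_within_budget(budget, tokens_per_attempt=2_000)
        print(f"budget={budget:>7}  attempts={n:>3}  "
              f"P(at least one correct)={best_of_n_success(p, n):.3f}")
```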

October 07

4:15-5:15 PM (America/New_York)

On Beck-Fiala and Komlós Conjectures

A conjecture of Komlós states that the discrepancy of any collection of unit vectors is O(1), i.e., for any matrix A with unit-norm columns, there is a vector x with entries in {-1, 1} such that |Ax|_\infty = O(1). The related Beck-Fiala conjecture states that any set system with maximum degree k has discrepancy O(k^{1/2}).

I will describe an O((log n)^{1/4}) bound for the Komlós problem, improving upon the O((log n)^{1/2}) bound due to Banaszczyk, using the framework of discrete Brownian motion guided by semidefinite programs. Time permitting, I will sketch how these ideas can be used to resolve the Beck-Fiala conjecture for k >= (log n)^2.
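For concreteness, here is a brute-force check of the discrepancy quantity defined in the abstract: the minimum over sign vectors x of |Ax|_\infty for a matrix A with unit-norm columns. It is exponential in the number of columns and purely illustrative; numpy and the random test matrix are assumptions, not part of the abstract.

```python
import itertools
import numpy as np


def discrepancy(A: np.ndarray) -> float:
    """Brute-force disc(A) = min over x in {-1,1}^n of ||Ax||_inf.
    Exponential in the number of columns; only for tiny examples."""
    _, n = A.shape
    best = float("inf")
    for signs in itertools.product((-1.0, 1.0), repeat=n):
        x = np.array(signs)
        best = min(best, np.abs(A @ x).max())
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(6, 10))
    A /= np.linalg.norm(A, axis=0)  # unit-norm columns, as in the Komlós setting
    print(discrepancy(A))
```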

September 23

4:15-5:15 PM (America/New_York)

Explicit Lossless Vertex Expanders

We give the first explicit construction of lossless vertex expanders. These are d-regular graphs in which every small set S of vertices has at least (1-ε)d|S| distinct neighbors. Previously, the strongest known explicit vertex expanders were those given by Ramanujan graphs, whose spectral properties imply that every small set S of vertices has at least 0.5d|S| distinct neighbors.

Based on joint work with Jun-Ting Hsieh, Ting-Chun Lin, Alex Lubotzky, Sidhanth Mohanty, Ryan O'Donnell, and Assaf Reiner.
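The expansion condition in the abstract can be checked by brute force on tiny graphs. The sketch below tests whether every vertex set up to a given size has at least (1-ε)d|S| distinct neighbors; the adjacency list and the K_{3,3}-style example are hypothetical, and the example prints False because vertices on the same side share all their neighbors.

```python
import itertools


def is_lossless_expander(adj: dict, d: int, eps: float, max_set_size: int) -> bool:
    """Brute-force check of the definition: every set S with |S| <= max_set_size
    must have at least (1 - eps) * d * |S| distinct neighbors.
    Exponential in max_set_size; illustrative only."""
    vertices = list(adj)
    for k in range(1, max_set_size + 1):
        for S in itertools.combinations(vertices, k):
            neighbors = set()
            for v in S:
                neighbors.update(adj[v])
            if len(neighbors) < (1.0 - eps) * d * k:
                return False
    return True


if __name__ == "__main__":
    # Toy 3-regular graph (complete bipartite K_{3,3}); not a lossless expander.
    adj = {0: [3, 4, 5], 1: [3, 4, 5], 2: [3, 4, 5],
           3: [0, 1, 2], 4: [0, 1, 2], 5: [0, 1, 2]}
    print(is_lossless_expander(adj, d=3, eps=0.2, max_set_size=2))
```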

September 16

4:15-5:15 PM (America/New_York)

Sparsification of 1-in-3-SAT

I will introduce a new notion of sparsification that does not drop constraints but instead merges variables. Using tools from additive combinatorics, I will then show that 1-in-3-SAT admits a sub-quadratic sparsifier. As an application, I will present an improved approximation algorithm for finding a linearly-ordered colouring of 3-uniform hypergraphs.

Based on joint work with B. Bedert, T.-V. Nakajima, and K. Okrasa, to appear in FOCS'25.
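To fix terminology, the sketch below spells out the two objects in the abstract: the 1-in-3-SAT predicate (exactly one true variable per clause) and the operation of merging variables. The merge map and the tiny instance are hypothetical, and this is not the additive-combinatorics construction from the paper; it only illustrates what a variable-merging sparsifier acts on.

```python
from itertools import product


def satisfies_1in3(clauses, assignment):
    """1-in-3-SAT: each clause is a triple of variables and is satisfied
    when exactly one of them is set to True."""
    return all(sum(assignment[v] for v in clause) == 1 for clause in clauses)


def merge_variables(clauses, merge_map):
    """Identify variables according to merge_map (variable -> representative),
    the basic operation behind a merging-based sparsifier."""
    return [tuple(merge_map.get(v, v) for v in clause) for clause in clauses]


if __name__ == "__main__":
    clauses = [("x1", "x2", "x3"), ("x2", "x3", "x4"), ("x1", "x3", "x4")]
    merged = merge_variables(clauses, {"x4": "x1"})  # hypothetical merge x4 -> x1
    variables = sorted({v for clause in merged for v in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies_1in3(merged, assignment):
            print("satisfying assignment after merge:", assignment)
            break
```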

September 09

4:15-5:15 PM (America/New_York)

A New Paradigm for Learning with Distribution Shift

We revisit the fundamental problem of learning with distribution shift, where a learner is given labeled samples from a training distribution D, unlabeled samples from a test distribution D′, and is asked to output a classifier with low test error. The standard approach in this setting is to prove a generalization bound in terms of some notion of distance between D and D′. These distances, however, are difficult to compute, and this has been the main stumbling block for efficient algorithm design over the last two decades.

We sidestep this issue and define a new model called TDS learning, where the learner runs a test on the training and test samples and is allowed to reject if this test detects distribution shift relative to a fixed output classifier. This approach leads to the first set of efficient algorithms for learning with distribution shift that do not require any assumptions on the test distribution. Finally, we discuss how our techniques have recently been used to solve longstanding problems in supervised learning with contamination.
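The TDS interface described above can be summarized in a few lines: learn on the labeled training sample, run a test involving the training and test samples, and either reject or output the classifier. In the sketch below the classifier and the moment-based test are placeholders chosen here for illustration; the actual TDS testers are tailored to the concept class being learned and come with formal guarantees not modeled here.

```python
import numpy as np


def train_classifier(X, y):
    """Toy classifier (class-mean difference); stands in for the learner."""
    w = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    return lambda Z: (Z @ w > 0).astype(int)


def shift_test(X_train, X_test, tol=0.5):
    """Placeholder test: accept only if the first moments of the training and
    test samples are close.  Real TDS testers use far more refined statistics;
    this only mimics the accept/reject interface."""
    return np.linalg.norm(X_train.mean(axis=0) - X_test.mean(axis=0)) <= tol


def tds_learn(X_train, y_train, X_test):
    clf = train_classifier(X_train, y_train)
    if not shift_test(X_train, X_test):
        return None  # reject: possible distribution shift detected
    return clf       # accept: output the classifier


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X_train = rng.normal(size=(200, 5))
    y_train = (X_train[:, 0] > 0).astype(int)
    X_test_ok = rng.normal(size=(200, 5))
    X_test_shifted = rng.normal(loc=2.0, size=(200, 5))
    print(tds_learn(X_train, y_train, X_test_ok) is not None)       # likely True
    print(tds_learn(X_train, y_train, X_test_shifted) is not None)  # likely False
```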