Welcome to the CSAIL Forum—a space for imagination, insight, and innovation in computer science and artificial intelligence. We invite you to actively participate, challenge, and connect.
Transformers are the dominant architecture for language modeling (and generative AI more broadly). The attention mechanism is considered core to the architecture and enables accurate sequence modeling at scale. However, the complexity of attention is quadratic in input length, which makes it difficult to apply Transformers to long sequences. Moreover, Transformers have theoretical limitations on the class of problems they can solve, which prevents them from modeling certain kinds of phenomena such as state tracking. This talk will describe some recent work on efficient alternatives to Transformers that can overcome these limitations.
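To make the complexity contrast concrete, here is a minimal sketch (not the speaker's implementation) comparing standard softmax attention, whose score matrix grows quadratically with sequence length, against a linear-attention-style recurrence that keeps a fixed-size state. The positive feature map `phi` is an illustrative assumption, not a specific published kernel.

```python
import numpy as np

def softmax_attention(q, k, v):
    # Scores form an (n, n) matrix: compute and memory grow
    # quadratically with sequence length n.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def linear_attention(q, k, v):
    # Linear-attention-style causal recurrence: maintain a fixed-size
    # (d, d_v) state instead of an (n, n) score matrix, so cost is
    # linear in sequence length.
    phi = lambda x: np.maximum(x, 0.0) + 1e-6  # positive feature map (illustrative assumption)
    n, d = q.shape
    state = np.zeros((d, v.shape[-1]))  # running sum of phi(k_t) v_t^T
    norm = np.zeros(d)                  # running sum of phi(k_t)
    out = np.empty_like(v)
    for t in range(n):
        state += np.outer(phi(k[t]), v[t])
        norm += phi(k[t])
        out[t] = phi(q[t]) @ state / (phi(q[t]) @ norm)
    return out
```

The recurrent form also hints at the state-tracking limitation: the model's entire memory of the prefix is a fixed-size state, and the design question is what state-update rules are both efficient and expressive.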
Manish Raghavan is the Drew Houston (2005) Career Development Professor at the MIT Sloan School of Management and Department of Electrical Engineering and Computer Science. Before that, he was a postdoctoral fellow at the Harvard Center for Research on Computation and Society (CRCS). His research centers on the societal impacts of algorithms and AI.
I will argue that representations in different deep nets are converging. First, I will survey examples of convergence in the literature: over time and across multiple domains, the ways in which different neural networks represent data are becoming more aligned. Next, I will demonstrate convergence across data modalities: as vision models and language models get larger, they measure distance between datapoints in increasingly similar ways. I will hypothesize that this convergence is driving toward a shared statistical model of reality, akin to Plato's concept of an ideal reality. We term such a representation the platonic representation and discuss several possible selective pressures toward it. Finally, I'll discuss the implications of these trends, their limitations, and counterexamples to our analysis.
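One common way to quantify how similarly two models "measure distance between datapoints" is a kernel-alignment score such as linear CKA (centered kernel alignment); the sketch below is illustrative and is not necessarily the metric used in the talk. It compares the pairwise-similarity structure two embedding spaces induce over the same n datapoints, returning 1.0 for representations that agree up to rotation and scale.

```python
import numpy as np

def _center(K):
    # Double-center a kernel (Gram) matrix.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def linear_cka(X, Y):
    # X: (n, d1) embeddings from model A; Y: (n, d2) embeddings from
    # model B, for the same n datapoints. Compares the two Gram
    # matrices: 1.0 = same similarity structure, near 0 = unrelated.
    Kx = _center(X @ X.T)
    Ky = _center(Y @ Y.T)
    return float((Kx * Ky).sum() / (np.linalg.norm(Kx) * np.linalg.norm(Ky)))
```

Because linear CKA depends only on the Gram matrices, it is invariant to orthogonal transformations of either embedding space, which is what makes it usable for comparing representations across architectures and even modalities.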