Session 2: Contributed Talks

Salva Rühling Cachay

A Dynamics-informed Diffusion Model for Weather and Climate Prediction

We introduce a novel framework, DYffusion, for large-scale probabilistic forecasting. It harnesses the power of diffusion models while fundamentally reimagining their temporal dynamics. By directly coupling the temporal dynamics of the data with the diffusion process, our approach overcomes key limitations of traditional diffusion models. DYffusion achieves significantly faster sampling speeds and reduced memory requirements while generating stable and accurate probabilistic forecasts over extended time horizons. We showcase its effectiveness through an application to climate model emulation, where it successfully generates fast and accurate 10- to 100-year-long global climate simulations at 6-hourly resolution.
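One way to picture "coupling the temporal dynamics of the data with the diffusion process" is to let the reverse-process steps correspond to physical time steps rather than Gaussian denoising levels. The following is a minimal conceptual sketch, not the authors' implementation; the forecaster/interpolator names, signatures, and toy stand-ins are illustrative assumptions only.

def dynamics_informed_sample(x0, forecaster, interpolator, num_steps):
    """Hypothetical sampling loop in which each 'diffusion' step advances physical time.

    forecaster(x_k, k):       predicts the end-of-horizon state from an
                              intermediate state x_k at time step k.
    interpolator(x0, x_h, k): stochastically estimates the state at time step k
                              between the initial condition x0 and the forecast x_h.
    """
    x_k = x0
    x_h = None
    for k in range(num_steps):
        x_h = forecaster(x_k, k)                # refine the horizon forecast
        if k + 1 < num_steps:
            x_k = interpolator(x0, x_h, k + 1)  # advance one physical time step
    return x_h


if __name__ == "__main__":
    import random
    # Toy stand-ins for the learned networks, just to make the sketch runnable.
    forecaster = lambda x, k: x + 1.0
    interpolator = lambda x0, xh, k: x0 + (xh - x0) * k / 10 + 0.01 * random.random()
    print(dynamics_informed_sample(0.0, forecaster, interpolator, num_steps=10))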

Bio:
Salva Rühling Cachay is a third-year PhD student at UC San Diego working on generative modeling and AI for science under the guidance of Prof. Rose Yu and Prof. Duncan Watson-Parris. His research focuses on developing innovative AI solutions for weather and climate modeling, aiming to advance our understanding and prediction of Earth's complex systems. His work has garnered recognition at top AI conferences, including the best paper award at the ML for Earth System Modeling workshop at ICML 2024. His expertise has led to fruitful research collaborations with leading institutions, including internships at NVIDIA and the Allen Institute for AI (Ai2).

Zachary Novack

Presto! Distilling Steps and Layers for Accelerating Music Generation

Despite advances in diffusion-based text-to-music (TTM) methods, efficient, high-quality generation remains a challenge. We introduce Presto!, an approach to inference acceleration for score-based diffusion transformers that reduces both the number of sampling steps and the cost per step. To reduce steps, we develop a new score-based distribution matching distillation (DMD) method for the EDM family of diffusion models, the first GAN-based distillation method for TTM. To reduce the cost per step, we develop a simple but powerful improvement to a recent layer distillation method that improves learning by preserving hidden-state variance. Finally, we combine our step and layer distillation methods into a dual-faceted approach. We evaluate our step and layer distillation methods independently and show that each yields best-in-class performance. Furthermore, we find that our combined distillation method generates high-quality outputs with improved diversity while accelerating our base model by 10-18x (a 32-second output in 230 ms, 15x faster than the comparable SOTA model) -- the fastest high-quality TTM model to our knowledge.
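The abstract's core accounting is that inference cost is roughly (number of sampling steps) x (cost per step), and the two distillation methods attack one factor each. The toy sketch below only illustrates that factorization; it is not Presto!'s code, and all class, method, and parameter names are assumed stand-ins.

class ToyDiffusionTransformer:
    """Stand-in for a diffusion transformer whose per-step cost scales with depth."""

    def __init__(self, num_layers: int):
        self.num_layers = num_layers

    def denoise(self, x: float, step: int, layers_to_run=None) -> float:
        depth = layers_to_run or self.num_layers
        for _ in range(depth):  # pretend each layer does a bit of work
            x = 0.9 * x
        return x


def generate(model, x: float, num_steps: int, layers_to_run=None) -> float:
    """Iterative sampler: step distillation shrinks num_steps, layer distillation
    shrinks the depth run per step; combining both shrinks total cost multiplicatively."""
    for step in range(num_steps):
        x = model.denoise(x, step, layers_to_run)
    return x


if __name__ == "__main__":
    base = ToyDiffusionTransformer(num_layers=24)
    print(generate(base, 1.0, num_steps=40))                    # baseline cost ~ 40 x 24
    print(generate(base, 1.0, num_steps=4, layers_to_run=12))   # distilled cost ~ 4 x 12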

Bio:
Zachary Novack is a third-year PhD student at UC San Diego, working on generative models and AI for audio and music, advised by Julian McAuley and Taylor Berg-Kirkpatrick. Zachary's research focuses on building controllable and efficient generative music systems, enabling bespoke artistic control axes at lightning-fast speeds for musicians and everyday users alike. His research has been recognized at top AI conferences, including an ICML 2024 Oral and an upcoming ICLR 2025 Spotlight. He has worked with leading industry labs in the generative audio space, including Adobe Research and Stability AI. In his free time, Zachary likes to play beach volleyball and teaches the 11-time world-finalist indoor percussion ensemble POW Percussion.