Tim posing with a recurrent neural network

Tim Cooijmans

I am an ML&AI researcher interested in the dynamics of learning systems, such as agents updating their beliefs or adjusting their parameters by gradient descent. My focus is on stability of learning, by which I mean that the agent maintains its ability to pursue its goals as it learns. With Recurrent Batch Normalization, I stabilized the hidden state process of recurrent systems, dramatically improving their training and generalization.

I am currently working in the area of multi-agent reinforcement learning (MARL), which exaggerates some of the difficulties of single-agent (and supervised) learning, and introduces several new ones. In my view, the failure of gradient descent on MARL problems is fundamental, and its study will lead to new perspectives on, and opportunities for, learning in general. In Meta-Value Learning, I propose an algorithm that generates a surrogate (the meta-value) on which gradient descent is stable and, I suspect, finds global (Pareto) optima in principle.


Publications

Advantage Alignment Algorithms: simple, cheap, effective opponent shaping


Juan Agustin Duque, Milad Aghajohari, Tim Cooijmans, Razvan Ciuca, Tianyu Zhang, Gauthier Gidel, Aaron Courville. Submitted to ICLR 2025.

ReaLchords: tailoring generative models for jamming with RL fine-tuning and distillation


Yusong Wu, Tim Cooijmans, Kyle Kastner, Adam Roberts, Ian Simon, Alexander Scarlatos, Chris Donahue, Cassie Tarakajian, Shayegan Omidshafiei, Aaron Courville, Pablo Samuel Castro, Natasha Jaques, Cheng-Zhi Anna Huang. ICML 2024.

Learning with Opponent Q-Learning Awareness: solving social dilemmas by influencing opponent Q-values


Milad Aghajohari, Juan Agustin Duque, Tim Cooijmans, Aaron Courville. ICLR 2024.

Meta-Value Learning: solving social dilemmas with a novel general meta-learning framework.


Tim Cooijmans, Milad Aghajohari, Aaron Courville. ICML 2023 Frontiers workshop.

Best-Response Shaping: solving social dilemmas by differentiating through the best response.


Milad Aghajohari, Tim Cooijmans, Juan Agustin Duque, Shunichi Akatsuka, Aaron Courville. RLC 2024.

SUNMASK: Mask-Enhanced Control in Step-Unrolled Denoising Autoencoders.


Kyle Kastner, Tim Cooijmans, Yusong Wu, Aaron Courville. EvoMUSART 2023.

MIDI-DDSP: detailed control of musical performance via hierarchical modeling.


Yusong Wu, Ethan Manilow, Yi Deng, Rigel Swavely, Kyle Kastner, Tim Cooijmans, Aaron Courville, Cheng-Zhi Anna Huang, Jesse Engel. ICLR 2022.

Harmonic Recomposition using Conditional Autoregressive Modeling.


Kyle Kastner, Rithesh Kumar, Tim Cooijmans, Aaron Courville. ICML 2018 workshop.

Coconet: the ML model behind today’s Bach Doodle.


Cheng-Zhi Anna Huang, Tim Cooijmans, Monica Dinculescu, Adam Roberts, Curtis Hawthorne. Blog post, 2018.

Memorization in Recurrent Neural Networks.


Tegan Maharaj, David Krueger, Tim Cooijmans. ICML 2017 PADL workshop.

Counterpoint by Convolution: a convolutional model of Bach’s chorales with Gibbs sampling strategy.


Cheng-Zhi Anna Huang, Tim Cooijmans, Adam Roberts, Aaron Courville, Douglas Eck. ISMIR 2017.

Recurrent Batch Normalization: the first successful use of normalization in RNN transitions.


Tim Cooijmans, Nicolas Ballas, César Laurent, Çağlar Gülçehre, Aaron Courville. ICLR 2017.

Dynamic Capacity Networks: models that dynamically allocate capacity to salient features.


Amjad Almahairi, Nicolas Ballas, Tim Cooijmans, Yin Zheng, Hugo Larochelle, Aaron Courville. ICML 2016.

Monte-Carlo simulations of the radiation environment for the CMS experiment.


Sophie Mallows, Igor Azhgirey, Igor Bayshev, Ida Bergstrom, Tim Cooijmans, Anne Dabrowski, Lisa Glöggler, Moritz Guthoff, Igor Kurochkin, Helmut Vincke, S Tajeda. Elba2015 13th Pisa Meeting on Advanced Detectors.