Mandana Samiei

I am a 5th-year PhD student in Computer Science at Mila - Quebec AI Institute and the Reasoning and Learning Lab at McGill University in Montréal, where I am advised by Prof. Doina Precup and Prof. Blake Richards. I am interested in designing AI systems that learn and adapt continually: instead of mastering specific tasks, they acquire general skills and features that can be reused in a new task, dataset, or environment. I study the models that underlie such adaptive and robust generalization.

Just as humans can continuously adapt, can AI agents achieve similar abilities?

My research centers on how AI agents can efficiently learn abstractions of complex sensory observations of the world, enabling them to reason, plan, and make decisions. Recently, I have become increasingly interested in the role of structure in designing adaptable agents that receive observations or images and output actions, policies, or human-readable text.

Email     CV     GitHub     Scholar     BlueSky     Twitter     LinkedIn

Finally, I would like to share a quote from Jean Piaget, whose research has inspired me: "Scientific knowledge is in perpetual evolution; it finds itself changed from one day to the next."


Highlights & News




Selected Research and Publications

Learning Schemas in Reinforcement Learning: bottleneck structure discovery.

Mandana Samiei, Doina Precup and Blake A. Richards

In this work, we show how schemas can be learned in RL by discovering the bottleneck structure of a task.

Under submission at Nature Communications.

The Schema Spectrum: Explicit, Implicit, and Emergent Structures in AI and the Brain.

Mandana Samiei, Doina Precup and Blake A. Richards

This perspective uses recent results in generative AI models to reconsider two standard assumptions in schema theory: (1) that schemas exist as explicit objects in the brain, and (2) that schemas are categorically distinct from episodic memories. This is based on the observation that large language models exhibit many phenomena reminiscent of schematic learning in the absence of any explicit engineering as such, suggesting that schemas may be an emergent property of distributed representations in neural networks.

Under review at Neuron.

Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists?

Language model (LM) agents exhibit human-like biases when exploring causally; we compare their behavior to human data. We also develop a scalable test-time sampling algorithm that corrects these biases by sampling hypotheses as code and acting to eliminate them.

Anthony GX-Chen, Dongyan Lin*, Mandana Samiei*, Doina Precup, Blake Richards, Rob Fergus, Kenneth Marino

*equal contribution, ordered alphabetically. Presenters are shown with underline.

Presented at the Conference on Language Modeling (COLM) 2025.

AIF-GEN: Open-Source Platform and Synthetic Dataset Suite for Reinforcement Learning on Large Language Models

Jacob Chmura, Shahrad Mohammadzadeh, Ivan Anokhin, Jacob-Junqi Tian, Mandana Samiei, Taz Scott-Talib, Irina Rish, Doina Precup, Reihaneh Rabbany, Nishanth Anand

Presented at the ICML 2025 Workshop on Championing Open-source Development in Machine Learning (CODEML'25).

Presenters are shown with underline.

The Role of Schemas in Reinforcement Learning: Insights and Implications for Generalization

Mandana Samiei, Doina Precup and Blake A. Richards

In cognitive psychology, schemas are considered to be "building blocks" of cognition, shaping how people view the world and interact with it. The goal of this paper is to propose a method for learning schemas in RL. We argue that, by representing tasks through schemas, agents can more effectively generalize from past experiences and adapt to new, unseen environments with minimal data.

Presented at Reinforcement Learning and Decision Making - RLDM 2025.

A conceptual analysis of continual learning objectives.

Giulia Lanzillotta, Mandana Samiei, Claire Vernade, and Razvan Pascanu

Continual learning solutions often treat multitask learning as an upper bound on what the learning process can achieve. This is a natural assumption, given that the multitask objective directly addresses the catastrophic forgetting problem, which was a central focus of early work. However, depending on the nature of the distributional shift in the data, the multitask solution is not always optimal for the broader continual learning problem. We draw on principles from online learning to formalize the limitations of multitask objectives.

Under submission at TMLR.

Testing Causal Hypotheses Through Hierarchical Reinforcement Learning

Anthony Chen*, Dongyan Lin*, Mandana Samiei*

*equal contribution, ordered alphabetically. Presenters are shown with underline.

We propose hierarchical reinforcement learning (HRL) as a key ingredient for building agents that can systematically generate and test hypotheses, enabling transferable learning about the world, and we discuss potential implementation strategies.

Presented as a poster at Intrinsically Motivated Open-ended Learning - IMOL Workshop at NeurIPS 2024.

Torchmeta: A Meta-Learning Library for PyTorch

Tristan Deleu, Tobias Würfl, Mandana Samiei, Joseph Paul Cohen, Yoshua Bengio

We introduce Torchmeta, a library built on top of PyTorch that enables seamless and consistent evaluation of meta-learning algorithms on multiple datasets, by providing data-loaders for most of the standard benchmarks in few-shot classification and regression, with a new meta-dataset abstraction.

Presented at the PyTorch Developer Conference (PTDC).

Towards Efficient Generalization in Continual RL using Episodic Memory

This work was part of a collaboration between Microsoft Research and Mila, done with Ida Momennejad, Geoffrey J. Gordon, John Langford, Mehdi Fatemi, Blake A. Richards, and Guillaume Lajoie. I gave an invited talk on Towards Efficient Generalization in Continual RL using Episodic Memory at the Microsoft Research Summit 2021. Slides.

Organizing Committee


D&I chair at Fourth Conference on Lifelong Learning Agents - CoLLAs 2025

Local chair at Second Conference on Lifelong Learning Agents - CoLLAs 2023

Organizer at First Conference on Lifelong Learning Agents - CoLLAs 2022


Senior organizer with Eda Okur - 19th Women in Machine Learning Workshop NeurIPS 2024

Senior organizer with Caroline Weis - Women in Machine Learning Symposium ICML 2024

Senior organizer - Women in Machine Learning Un-workshop ICML 2023

Organizer - Women in Machine Learning Un-workshop ICML 2020

Organizer at Machine Learning Reproducibility Challenge - MLRC 2023

Teaching

Tutorial on Large Language Models, M2L 2025

Mediterranean Machine Learning Summer School (M2L) in Split, Croatia, 8-12 September 2025

EEML PyTorch and Colab intro, Summer 2024

Taught by Nemanja Rakićević and Matko Bošnjak

Teaching Assistant: COMP 767 Reinforcement Learning, Winter 2021

Taught by Prof. Doina Precup

Teaching Assistant: COMP 417 Intro to Robotics & Intelligent Systems, Fall 2020

Taught by Prof. Dave Meger

Teaching Assistant: IFT 6390 Fundamentals of Machine Learning, Fall 2021

Taught by Prof. Ioannis Mitliagkas and Prof. Guillaume Rabusseau

Reviewing

  • Conference on Neural Information Processing Systems - NeurIPS 2020, 2023 & 2025
  • Reinforcement Learning Conference - RLC 2024 & 2025
  • Conference on Lifelong Learning Agents - CoLLAs 2022, 2024 & 2025
  • The International Conference on Learning Representations - ICLR 2023 & 2024
  • Transactions on Machine Learning Research - TMLR 2023, 2024 & 2025
  • Generative Models for Decision Making workshop at ICLR 2024
  • Decision Awareness in Reinforcement Learning (DARL) workshop at ICML 2022

Mentorship




Credit to Jon Barron for the template.
Last updated in January 2025.