ATLAS

Google DeepMind active-learning framework that closes the hypothesis-generation ↔ experiment-design loop to discover interpretable mechanistic models of behavior in cognitive science.


Affiliation	Google DeepMind, with Princeton / Columbia / UCL (paper)
First introduced	2026-06 (arXiv:2606.12386, dated 2026-06-10)
Lifecycle stages	Multi-stage — iterates data-driven hypothesis generation, optimal experiment design, and analysis of the resulting behavioral data
Autonomy level	Semi-autonomous — automates the hypothesize→design→collect loop; the human researcher interprets the discovered interpretable models
Domain focus	Cognitive science / computational behavioral modeling (test case: recovering reinforcement-learning agents from behavior)
Availability	No code released as of 2026-06-11 (none stated in the preprint)

Approach

ATLAS (Active Theory Learning for Automated Science) is an active-learning framework rather than an LLM-orchestration system, which distinguishes it from most entries in this catalogue. It runs iterative cycles of three components around a growing dataset:

Hypothesis Generator — trains a fresh set of Disentangled RNNs (DisRNNs) on the current dataset, sweeping the complexity-penalty parameter and random seeds. DisRNNs use information bottlenecks to converge on sparse, interpretable latent (“cognitive”) variables whose interactions form a hypothetical computational graph for the underlying mechanism. Varying the sparsity penalty maintains an ensemble of qualitatively distinct hypotheses, selected via an offset-softmax distribution over cross-validated likelihoods.
Experiment Optimizer — searches over experiment designs (binary T×A reward matrices) to maximize expected information gain, approximated as disagreement among ensemble members (analogous to the BALD objective). Optimization uses hill-climbing with 128 random restarts.
Experiment Runner — executes the optimized design against the ground-truth agent, appending the resulting trajectory to the dataset for the next cycle.
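
The Experiment Optimizer step can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the preprint's implementation: the ensemble members are toy Q-learners with different learning rates (standing in for DisRNNs), the design is read as a trials × arms binary matrix, the disagreement score is a Monte Carlo per-trial approximation, and names such as run_member, disagreement, and hill_climb are hypothetical. The restart count is kept small here for speed, where the source reports 128.

```python
import math
import random

N_ARMS = 2  # two-armed bandit, as in the validation tasks


def p_arm1(q, beta):
    """Logistic (two-way softmax) probability of choosing arm 1 given Q-values."""
    return 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))


def run_member(design, alpha, choices=None, rng=None, beta=3.0):
    """Toy Q-learner standing in for one ensemble member (hypothetical, not a DisRNN).

    If `choices` is given, replay that history; otherwise sample choices with `rng`.
    Returns (choices taken, per-trial probability of choosing arm 1).
    """
    q, probs, taken = [0.0, 0.0], [], []
    for t, row in enumerate(design):
        p1 = p_arm1(q, beta)
        a = choices[t] if choices is not None else int(rng.random() < p1)
        probs.append(p1)
        taken.append(a)
        q[a] += alpha * (row[a] - q[a])  # delta-rule update on the chosen arm
    return taken, probs


def entropy(p):
    """Binary entropy in nats."""
    return 0.0 if p <= 0.0 or p >= 1.0 else -(p * math.log(p) + (1 - p) * math.log(1 - p))


def disagreement(design, alphas, n_sims=8, seed=0):
    """Per-trial BALD-style score: H[mean prediction] - mean H[prediction]."""
    rng = random.Random(seed)  # fixed seed: the score is a deterministic function of the design
    total = 0.0
    for generator in alphas:  # sample plausible histories from each member in turn
        for _ in range(n_sims):
            history = run_member(design, generator, rng=rng)[0]
            preds = [run_member(design, a, choices=history)[1] for a in alphas]
            for t in range(len(design)):
                ps = [p[t] for p in preds]
                total += entropy(sum(ps) / len(ps)) - sum(map(entropy, ps)) / len(ps)
    return total / (len(alphas) * n_sims)


def hill_climb(alphas, n_trials=20, n_restarts=4, n_steps=60, seed=0):
    """Single-bit-flip hill-climbing over binary reward matrices with random restarts."""
    rng = random.Random(seed)
    best_design, best_score = None, -math.inf
    for _ in range(n_restarts):
        design = [[rng.randint(0, 1) for _ in range(N_ARMS)] for _ in range(n_trials)]
        score = disagreement(design, alphas)
        for _ in range(n_steps):
            t, a = rng.randrange(n_trials), rng.randrange(N_ARMS)
            design[t][a] ^= 1  # propose flipping one reward entry
            new_score = disagreement(design, alphas)
            if new_score > score:
                score = new_score
            else:
                design[t][a] ^= 1  # revert a non-improving flip
        if score > best_score:
            best_design, best_score = [row[:] for row in design], score
    return best_design, best_score


design, score = hill_climb(alphas=[0.1, 0.5, 0.9])
print(f"disagreement of best design: {score:.4f}")
```

A design in which no arm ever pays out scores zero in this sketch, because every member then predicts a 50/50 choice on every trial; the search is rewarded only for reward schedules that drive the members' predictions apart.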

The design draws on Query-by-Committee active learning and optimal experimental design, but replaces a fixed model committee with structurally constrained neural networks whose internal diversity encodes distinct mechanistic hypotheses.
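
For reference, the disagreement criterion shared by Query-by-Committee and BALD is the mutual information between the experimental outcome and the identity of the model. The notation below is introduced here for illustration and is not taken from the preprint: d is a design, y the resulting behavior, m an ensemble member, and w_m its weight in the ensemble (how ATLAS sets these weights is not assumed here).

```latex
\mathrm{EIG}(d) \;=\; I(y;\, m \mid d)
\;=\; H\!\left[\sum_{m} w_m\, p(y \mid d, m)\right]
\;-\; \sum_{m} w_m\, H\!\left[\,p(y \mid d, m)\,\right]
```

The first term is large when the ensemble's averaged prediction is uncertain; the second subtracts the uncertainty that each member has on its own. The score is therefore high only for designs on which members make confident but conflicting predictions.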

Validation

In-silico evaluation on recovering two reinforcement-learning agents — a Q-learning agent and a Leaky Actor-Critic agent — from their behavior in two-armed bandit tasks. Eight independent ATLAS runs of 100 cycles each were compared against two open-loop baselines (i.i.d. random rewards and Gaussian random-walk rewards) and against fixed expert-designed experiments drawn from the literature. Discovered models were scored on three criteria: behavioral similarity (held-out likelihood), structural similarity (isomorphism of the recovered computational graph to ground truth), and dynamical similarity (bidirectional bisimulation error). No wet-lab or real-world data — validation is benchmark-tier.
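
Two of the three scoring criteria can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the preprint's evaluation code: heldout_log_likelihood and graphs_isomorphic are hypothetical names, the likelihood is averaged per trial, and the isomorphism check permutes all nodes by brute force, which is practical only for the small graphs typical of a handful of latent variables.

```python
import math
from itertools import permutations


def heldout_log_likelihood(choice_probs, choices):
    """Mean per-trial log-likelihood of held-out choices.

    choice_probs[t][a] is the model's predicted probability of action a on trial t.
    """
    return sum(math.log(p[a]) for p, a in zip(choice_probs, choices)) / len(choices)


def graphs_isomorphic(edges_a, edges_b, n_nodes):
    """True if two directed graphs on n_nodes nodes are equal up to relabeling."""
    target = set(edges_b)
    if len(set(edges_a)) != len(target):
        return False  # different edge counts can never match
    for perm in permutations(range(n_nodes)):
        if {(perm[u], perm[v]) for u, v in edges_a} == target:
            return True
    return False


# Behavioral similarity: higher (closer to 0) is better.
ll = heldout_log_likelihood([[0.5, 0.5], [0.25, 0.75]], [0, 1])

# Structural similarity: a chain 0 -> 1 -> 2 matches a relabeled chain, not a fan-out.
same = graphs_isomorphic([(0, 1), (1, 2)], [(2, 0), (0, 1)], 3)
different = graphs_isomorphic([(0, 1), (1, 2)], [(0, 1), (0, 2)], 3)
```

In a real comparison the input and output nodes of the computational graph would likely be held fixed and only the latent variables permuted; the sketch omits that constraint for brevity.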

Notable results

5–10× improvement in sample efficiency across behavioral, structural, and dynamical-similarity metrics versus random experimentation.
Recovered the correct computational graph for 8/8 seeds after 100 experiments for both agents, whereas the random and expert-designed baselines needed ~1,000 experiments (and the expert baseline never recovered it for the Leaky Actor-Critic agent).
Matched or surpassed expert-designed experiments, while producing designs with distinctive temporal structure unlike those researchers typically hand-craft; ablations show DisRNN ensembles yield more structured, informative designs than vanilla GRU ensembles.

Primary paper

Éltető, Daw, Stachenfeld, Miller, “ATLAS: Active Theory Learning for Automated Science,” arXiv:2606.12386.

Other references

None yet.

Code

Not released at preprint time.