Background readings

We are building on decades of work by us and others. To understand our work, we recommend familiarizing yourself with the following prior work.

The OaK architecture
Richard S. Sutton
The Alberta plan for AI research
Richard S. Sutton, Michael Bowling, Patrick M. Pilarski
The big world hypothesis and its ramifications for AI
Khurram Javed, Richard S. Sutton
SwiftTD: A fast and robust algorithm for temporal difference learning
Khurram Javed, Arsalan Sharifnassab, Richard S. Sutton
Step-size optimization for continual learning
Thomas Degris, Khurram Javed, Arsalan Sharifnassab, Yuxin Liu, Richard S. Sutton
Reward-respecting subtasks for model-based reinforcement learning
Richard S. Sutton, Marlos C. Machado, G. Zacharias Holland, David Szepesvari, Finbarr Timbers, Brian Tanner, Adam White
The quest for a common model of the intelligent decision maker
Richard S. Sutton
Reward is enough
David Silver, Satinder Singh, Doina Precup, Richard S. Sutton
Planning with expectation models
Yi Wan, Zaheer Abbas, Adam White, Martha White, Richard S. Sutton
On the role of tracking in stationary environments
Richard S. Sutton, Anna Koop, David Silver
Scalable real-time recurrent learning using columnar-constructive networks
Khurram Javed, Haseeb Shah, Richard S. Sutton, Martha White