
Inspecting the gradients of entropy-augmented policy updates to show their equivalence
Extending DQN to estimate return distributions, and exploring why this helps learning
The purpose statement and introduction to Computable AI.