Computable AI

Keeping to the Narrow Path

Better imitation learning with self-correcting policies by negative sampling.

Daniel Cox • Sun 21 July 2019 in arXiv highlights

Way Off-Policy Batch DRL

Pre-training using a generative model of pre-recorded trajectories and bias correction.

Daniel Cox • Sun 14 July 2019 in arXiv highlights

A New Series arXiv Sampler

Beginning a new series highlighting a few interesting RL papers on the arXiv each week. This week: simple curriculum learning, learning to interact with humans, and warm-starting RL with propositional logic.

Daniel Cox • Sun 07 July 2019 in arXiv highlights

Categories