Computable AI

Cox's Theorem: Establishing Probability Theory

Cox's theorem is the strongest argument for the use of standard probability theory. Here we examine the axioms to establish a firm foundation for the interpretation of probability theory as the unique extension of true-false logic to degrees of belief.

Daniel Cox • Sun 03 November 2019 in arXiv highlights

Comments on Eight Abstracts

An unfocused sweep of eight abstracts from a very busy week in AI research: emergent tool use, why hierarchical learning can work so well, brain-inspired hardware for artificial neural networks, pretraining and transfer learning for RL, chromatic network compression, semi-supervised reward shaping, WGAN model imitation for model-based RL, and navigation in turbulent flows!

Daniel Cox • Sun 06 October 2019 in arXiv highlights

Active Perception in Adversarial Scenarios

Accumulating evidence about peers to discriminate potential threats.

Daniel Cox • Sun 22 September 2019 in arXiv highlights

Discovery of Useful Questions as Auxiliary Tasks

Learning more like a human, and more like a scientist, by actively seeking useful auxiliary questions during learning.

Daniel Cox • Sun 15 September 2019 in arXiv highlights

Deep Reinforcement Learning without Catastrophic Forgetting

Long-term learning of multiple tasks without forgetting old skills, using a new technique called Pseudo-Rehearsal.

Daniel Cox • Mon 09 September 2019 in arXiv highlights

Reward tampering

Improving safety and control by preventing all manner of reward tampering by the agent itself.

Daniel Cox • Sun 25 August 2019 in arXiv highlights

DRL Not Superhuman on Atari

Why DRL may not be superhuman on Atari after all, and how to avoid making mistakes like that in the future.

Daniel Cox • Sun 18 August 2019 in arXiv highlights

Three Method Comparison for Traffic Signal Control

Comparing supervised learning, random search, and deep reinforcement learning on traffic signal control.

Daniel Cox • Sun 11 August 2019 in arXiv highlights

Learning Compound and Composable Policies

Straightforward hierarchical RL for concurrent discovery of sub-policies and their controller.

Daniel Cox • Sun 04 August 2019 in arXiv highlights

Efficient exploration with self-imitation learning

I wonder if that happens every time...

Daniel Cox • Sun 28 July 2019 in arXiv highlights
