# Quantum error correction for the toric code using deep reinforcement learning. (arXiv:1811.12338v2 [quant-ph] UPDATED)

We implement a quantum error correction algorithm for bit-flip errors on the
topological toric code using deep reinforcement learning. An action-value
Q-function encodes the discounted value of moving a defect to a neighboring
site on the square grid (the action) depending on the full set of defects on
the torus (the syndrome or state). The Q-function is represented by a deep
convolutional neural network. Translational invariance on the torus allows
each defect to be viewed from a central perspective, which significantly
simplifies the state-space representation independently of the number of defect
pairs. Training uses experience replay, in which data from the algorithm being
played out is stored and used for mini-batch updates of the Q-network. We find
performance which is close to, and for small error rates
asymptotically equivalent to, that achieved by the Minimum Weight Perfect
Matching algorithm for code distances up to $d=7$. Our results show that it is
possible for a self-trained agent without supervision or support algorithms to
find a decoding scheme that performs on par with hand-made algorithms, opening
the way for future machine-engineered decoders for more general error models
and error-correcting codes.
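
The central-perspective idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that the syndrome is stored as a $d \times d$ binary array with periodic boundaries (1 marks a defect); the function names `center_on_defect` and `perspectives` are hypothetical and are not taken from the paper. Because the lattice is a torus, a cyclic shift moves any chosen defect to the center without changing the relative positions of the others, so one network input shape can serve for any number of defects.

```python
import numpy as np


def center_on_defect(syndrome, row, col):
    """Cyclically shift a periodic d x d syndrome so that the defect
    at (row, col) lands on the central site (d // 2, d // 2)."""
    d = syndrome.shape[0]
    c = d // 2
    # np.roll wraps around the edges, matching the torus geometry.
    return np.roll(syndrome, shift=(c - row, c - col), axis=(0, 1))


def perspectives(syndrome):
    """Return one centered view per defect as (position, view) pairs."""
    defects = np.argwhere(syndrome == 1)  # row-major order
    return [((int(r), int(c)), center_on_defect(syndrome, r, c))
            for r, c in defects]


# Illustrative syndrome with a single defect pair on a d = 5 grid
# (an assumed toy layout, not data from the paper).
d = 5
syndrome = np.zeros((d, d), dtype=np.int8)
syndrome[0, 4] = 1
syndrome[3, 1] = 1

views = perspectives(syndrome)
for position, view in views:
    print("defect at", position)
    print(view)
```

In a setup like the one the abstract describes, each such view would be a candidate input to the Q-network, which scores the moves available to the defect at the center.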