Multi-agent A2C for Nim-21

Heatmap of Q-values for each (state, action) combination.

Two agents learn to play Nim-21 using PyTorch and A2C (advantage actor-critic). Both agents learn the game-theoretically optimal strategy. The critic values show which states are winning and which are losing.
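For reference, the winning and losing states can also be computed exactly by backward induction, which gives a baseline to compare the critic values against. The sketch below is not taken from this repository. It assumes a common rule set that the text above does not spell out: a single pile of 21 objects, each move removes 1, 2, or 3 objects, and by default the player who takes the last object wins. The names `winning_states`, `optimal_action`, `MAX_TAKE`, and `last_taker_wins` are hypothetical and chosen for illustration only.

```python
MAX_TAKE = 3  # assumption: a move removes 1, 2, or 3 objects


def winning_states(n_start=21, max_take=MAX_TAKE, last_taker_wins=True):
    """Map each pile size to True if the player to move can force a win."""
    # With an empty pile, the previous player took the last object.
    # The player to move has therefore lost under "last taker wins"
    # and won under the opposite (misere) convention.
    wins = {0: not last_taker_wins}
    for s in range(1, n_start + 1):
        # A state is winning if some legal move leads to a losing state
        # for the opponent.
        wins[s] = any(not wins[s - a] for a in range(1, min(max_take, s) + 1))
    return wins


def optimal_action(s, wins, max_take=MAX_TAKE):
    """Return a move that leaves the opponent in a losing state, or None."""
    for a in range(1, min(max_take, s) + 1):
        if not wins[s - a]:
            return a
    return None  # every move leads to a winning state for the opponent


if __name__ == "__main__":
    wins = winning_states()
    losing = [s for s in sorted(wins) if not wins[s]]
    print("losing states for the player to move:", losing)
    print("a winning first move from 21:", optimal_action(21, wins))
```

Under these assumed rules the losing states are the multiples of 4. If the repository uses the opposite convention (the player who takes the last object loses), pass `last_taker_wins=False`; the losing states then shift to pile sizes that leave remainder 1 when divided by 4. A well-trained critic would be expected to separate these two groups of states, though the exact values depend on the reward scheme used in training.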