Adventures in Reinforcement Learning

Recently my son wanted to program a game for his school computing project.  After several rounds of discussion, in which I convinced him of the challenges of P2P socket programming and game state management, and of the fact that he would have to do most of the work himself (not me! 😂), we settled on a simple turn-based Othello game.

I first coded up the whole game myself to get a sense of its complexity and of the programming topics I would need to teach my son (e.g. data structures, game play, algorithm design, the GUI event model, etc.), so that I would be able to guide him as he coded up the game himself.

Simple turn based GUI for Othello game using Turtle graphics

I managed to code up a simple turn-based game in Python using Turtle graphics.  It was a fun project and it allowed me to exercise my coding muscles.  But I thought to myself, can I take this one step further?  It turns out I can.

I took this opportunity to extend the project to training a reinforcement learning agent to play Othello.  Eventually, I was able to develop the following:

  1. Custom Gym environment based on the Othello game
  2. DQN agent with simple epsilon greedy policy and replay buffer
  3. Human and Agent game mode for Othello

Usually I would turn this into a tutorial, but since there are a lot of areas to cover, I decided instead to share the resources and steps I used to develop this code so that, hopefully, you can do the same.
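To give a flavour of the first item in the list above, here is a minimal sketch of what a Gym-style Othello environment can look like. It is not the code from my repository: the class name `OthelloEnvSketch`, the board encoding (0 = empty, 1 = black, -1 = white), the flat 0 to 63 action numbering and the win/loss reward are all assumptions made for illustration. A real Gym environment would normally also subclass `gym.Env` and declare `action_space` and `observation_space`; this sketch leaves those out so that it runs with nothing but the standard library.

```python
# Board cells: 0 = empty, 1 = black, -1 = white (an illustrative encoding).
SIZE = 8
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


class OthelloEnvSketch:
    """Gym-style interface (reset/step) without depending on the gym package."""

    def reset(self):
        """Set up the four starting discs and return the first observation."""
        self.board = [[0] * SIZE for _ in range(SIZE)]
        self.board[3][3] = self.board[4][4] = -1  # white
        self.board[3][4] = self.board[4][3] = 1   # black
        self.player = 1                           # black moves first
        return self._observation()

    def _observation(self):
        # Return a copy so callers cannot mutate the internal board.
        return [row[:] for row in self.board]

    def _flips(self, row, col, player):
        """Return the opponent discs that a move at (row, col) would flip."""
        if self.board[row][col] != 0:
            return []
        flipped = []
        for dr, dc in DIRECTIONS:
            line = []
            r, c = row + dr, col + dc
            # Walk over a run of opponent discs in this direction.
            while 0 <= r < SIZE and 0 <= c < SIZE and self.board[r][c] == -player:
                line.append((r, c))
                r, c = r + dr, c + dc
            # The run only counts if it is capped by one of our own discs.
            if line and 0 <= r < SIZE and 0 <= c < SIZE and self.board[r][c] == player:
                flipped.extend(line)
        return flipped

    def legal_actions(self, player=None):
        """List legal moves as flat indices (row * 8 + col)."""
        player = self.player if player is None else player
        return [r * SIZE + c for r in range(SIZE) for c in range(SIZE)
                if self._flips(r, c, player)]

    def step(self, action):
        """Apply a move for the current player; returns (obs, reward, done, info)."""
        row, col = divmod(action, SIZE)
        flipped = self._flips(row, col, self.player)
        if not flipped:
            raise ValueError("illegal move")
        mover = self.player
        self.board[row][col] = mover
        for r, c in flipped:
            self.board[r][c] = mover
        # Hand the turn over unless the opponent has to pass.
        if self.legal_actions(-mover):
            self.player = -mover
        done = not self.legal_actions(self.player)
        reward = 0.0
        if done:
            # Assumed reward: +1 win, -1 loss, 0 draw, from the mover's view.
            diff = sum(map(sum, self.board)) * mover
            reward = float((diff > 0) - (diff < 0))
        return self._observation(), reward, done, {}
```

Calling `reset()` and then `legal_actions()` gives black's four opening moves, and each `step(action)` returns the usual observation, reward, done flag and info dictionary that a training loop expects.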


Step 1: Learning about Reinforcement Learning

A great resource that I used to learn about Reinforcement Learning is the Udemy course – Practical AI with Python and Reinforcement Learning. (https://www.udemy.com/course/practical-ai-with-python-and-reinforcement-learning/)

This is a great course that teaches the basics and theory of Reinforcement Learning together with relevant and useful coding exercises.  I recommend that you pay attention to the following chapters:

  1. Reinforcement Learning – Core Concepts
  2. OpenAI Gym Overview
  3. Classical Q Learning
  4. Deep Q-Learning
  5. Creating Custom OpenAI Gym Environments

Step 2: Papers and existing work done with RL for Othello

I found the following papers helpful in deciding the architecture of the DQN agent as well as how the training is to be done.

Step 3: Learn from others

I also found it useful to learn from others who have coded similar projects, to see how they implemented some of the concepts in the papers above.  One good repository is https://github.com/xyqfountain/Othello-Reversi-Env-for-Reinforcement-Learning


My current code is rather simple: training is based on an epsilon-greedy policy together with a basic implementation of a replay buffer.  However, even with this basic setup, I am able to achieve a win rate of between 70% and 80% (across 30 games) against an opponent using a random play strategy.
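For readers new to these two ideas, here is a minimal sketch of an epsilon-greedy move selector and a replay buffer. It is illustrative only and may differ from the code in my repository: the names `Transition`, `ReplayBuffer` and `epsilon_greedy`, the uniform sampling, and the restriction of the choice to legal moves are all assumptions. In a DQN setup, `q_values` would come from the network's output for the current board.

```python
import random
from collections import deque, namedtuple

# One stored step of experience; the field names are illustrative.
Transition = namedtuple("Transition", "state action reward next_state done")


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest entry when full.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sample without replacement to get a decorrelated training batch.
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


def epsilon_greedy(q_values, legal_actions, epsilon, rng=random):
    """With probability epsilon explore a random legal move, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(legal_actions)
    # Exploit: the legal action with the highest estimated Q-value.
    return max(legal_actions, key=lambda a: q_values[a])
```

During training, epsilon typically starts near 1.0 (mostly random moves) and is decayed towards a small value, while each played move is pushed into the buffer and batches are sampled from it to update the network.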

Agent (White player) training in progress with visual game play

You can download my code here – https://github.com/ianlokh/Othello.git which will allow you to train your own agent and even play against it. It is a fun way to assess how well you stack up against the agent.

I will be continually updating this code to explore other RL techniques so please come back for more updates.

That’s all for now and I hope you will find this useful in your RL learning journey!