Luna Chess is a single-threaded, single-GPU chess engine rated around 1850 Elo, trained entirely through self-play with no human knowledge beyond the rules of the game. It uses deep reinforcement learning and Monte Carlo Tree Search (MCTS) to develop its playing strategy.
The neural network at the heart of Luna Chess is parameterized by theta (θ) and takes the state of the chess board as input. It produces two outputs:
- A continuous value evaluation v∈[-1,1] of the board position from the current player's perspective
- A policy distribution p that represents probabilities over all possible actions
During training, the network learns from examples of the form (s_t, π_t, z_t), where:
- s_t is the state
- π_t is an estimate of the policy (a probability distribution over moves) from state s_t
- z_t is the final game outcome ∈ [-1,1], from the current player's perspective (see the labeling sketch below)
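As a rough illustration of how these labels are usually produced, each finished self-play game can be replayed and z_t assigned from the game result relative to the player to move. The helper below is hypothetical, not Luna's actual code:

```python
# Hypothetical sketch: turning a finished self-play game into training examples.
# history holds (state, player, pi) triples; winner is +1, -1, or 0 for a draw.
def label_examples(history, winner):
    # z_t = winner * player gives +1 for positions where the eventual winner is to move,
    # -1 for the loser, and 0 for draws.
    return [(state, pi, winner * player) for state, player, pi in history]
```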
/
├── LICENSE
├── README.md
├── makefile
├── poetry.lock
├── pyproject.toml
├── requirements.txt
├── runs/ # Training runs data
├── src/
│ ├── index.html # Web interface main page
│ ├── luna/ # Core engine components
│ │ ├── NNet.py # Neural Network wrapper
│ │ ├── coach.py # Training orchestration
│ │ ├── eval.py # Evaluation utilities
│ │ ├── game/ # Game mechanics
│ │ │ ├── arena.py # Self-play arena
│ │ │ ├── luna_game.py # Chess game logic
│ │ │ ├── player.py # Player implementations
│ │ │ └── state.py # Game state representation
│ │ ├── luna.py # Main interface for the engine
│ │ ├── luna_NN.py # Neural network architecture
│ │ ├── mcts.py # Monte Carlo Tree Search
│ │ └── utils.py # Utility functions
│ ├── luna_html_wrapper.py # Web interface backend
│ ├── main.py # Main training entry point
│ ├── playground.py # Development playground
│ └── static/ # Web assets
│ ├── chessboard.min.css
│ ├── chessboard.min.js
│ ├── img/
│ └── jquery.min.js
└── temp/ # Checkpoint storage
Install dependencies:
pip install -r requirements.txt
Key dependencies include:
- Python 3.7+
- PyTorch
- chess
- numpy
- Flask (for web interface)
- stockfish (optional, for comparison)
(Optional) If you want to use GPU training, ensure you have CUDA installed and compatible with your PyTorch version.
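A quick way to confirm that PyTorch can actually see your GPU before starting a long run:

```python
import torch

print(torch.cuda.is_available())  # True if a compatible GPU and CUDA runtime are available
print(torch.version.cuda)         # CUDA version this PyTorch build was compiled against (None for CPU-only builds)
```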
Training is managed by the Coach class, which handles self-play, learning, and model evaluation.
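Conceptually, each iteration follows the usual AlphaZero-style cycle of self-play, training, and arena evaluation. The outline below is illustrative only; the callables are stand-ins, and src/luna/coach.py contains the real logic:

```python
# Illustrative outline of one Coach iteration (not Luna's actual code).
def run_iteration(net, args, self_play_game, train, arena_win_rate):
    examples = []
    for _ in range(args['numEps']):
        examples += self_play_game(net, args)        # collect (s_t, pi_t, z_t) tuples
    candidate = train(net, examples)                 # fit the policy and value targets
    if arena_win_rate(candidate, net, args['arenaCompare']) >= args['updateThreshold']:
        return candidate                             # accept the new model
    return net                                       # otherwise keep the previous one
```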
Start training from scratch:
python src/main.py
Resume training from a checkpoint:
python src/main.py --load_model=True
Edit the args dictionary in src/main.py to adjust training settings:
args = dotdict({
    'numIters': 5,            # Number of training iterations
    'numEps': 10,             # Number of self-play games per iteration
    'tempThreshold': 10,      # Temperature threshold
    'updateThreshold': 0.6,   # Required win rate for new model acceptance
    'maxlenOfQueue': 20000,   # Maximum number of game examples to store
    'numMCTSSims': 10,        # Number of MCTS simulations per move
    'arenaCompare': 10,       # Number of games to compare new vs old model
    'cpuct': 1,               # Exploration constant in MCTS
    'checkpoint': './temp/',  # Directory to save checkpoints
    'load_model': False,      # Whether to load existing model
    'load_examples': True,    # Whether to load saved examples
    'load_folder_file': ('./pretrained_models/', 'best.pth.tar'),
    'numItersForTrainExamplesHistory': 5,
    'dir_noise': True,        # Add Dirichlet noise for exploration
    'dir_alpha': 1.4,         # Dirichlet alpha parameter
    'save_anyway': True       # Always save model
})
Training will produce checkpoint files in the ./temp/ directory, with the best model saved as best.pth.tar.
Luna Chess includes a web interface to play against the trained model.
Start the web server:
python src/luna_html_wrapper.py
Open your browser and navigate to:
http://127.0.0.1:5000/
The web interface provides three modes:
- Play as White: You play as white, Luna plays as black
- Play as Black: Luna plays as white, you play as black
- Self-Play: Watch Luna play against itself (navigate to /selfplay)
Use the Reset Game button to start a new game at any time.
In self-play mode:
- Start Self Play: Begin a game with Luna playing against itself
- Stop: Pause the self-play demonstration
- Reset Board: Start over with a fresh board
Luna can be pitted against Stockfish for evaluation:
from luna.game.player import StockFishPlayer
from luna.luna import Luna
# Initialize Luna
luna = Luna()
# Initialize Stockfish (adjust parameters as needed)
stockfish = StockFishPlayer(elo=1800, skill_level=10, depth=10)
# Set up comparison and play
# See luna/game/arena.py for implementation details
Luna's neural network uses a 3D convolutional architecture (see the sketch after this list):
- Input: 8x8x6 (board position serialized)
- Conv3D layers with batch normalization
- Several fully connected layers
- Outputs:
- Policy head: probability distribution over moves
- Value head: scalar evaluation of position
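A minimal PyTorch sketch of this kind of policy/value network is shown below; the layer sizes, action-space size, and class name are illustrative placeholders rather than Luna's exact architecture (see src/luna/luna_NN.py for the real one):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SketchNet(nn.Module):
    """Illustrative Conv3D policy/value network; all sizes are placeholders."""
    def __init__(self, action_size=4672, hidden=512):
        super().__init__()
        # Treat the serialized 8x8x6 board as a single-channel 3D volume.
        self.conv1 = nn.Conv3d(1, 32, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm3d(32)
        self.conv2 = nn.Conv3d(32, 64, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm3d(64)
        self.fc1 = nn.Linear(64 * 6 * 8 * 8, hidden)
        self.fc_policy = nn.Linear(hidden, action_size)  # policy head
        self.fc_value = nn.Linear(hidden, 1)             # value head

    def forward(self, board):                 # board: (batch, 6, 8, 8)
        x = board.unsqueeze(1)                # add a channel dimension for Conv3d
        x = F.relu(self.bn1(self.conv1(x)))
        x = F.relu(self.bn2(self.conv2(x)))
        x = F.relu(self.fc1(x.flatten(1)))
        pi = F.log_softmax(self.fc_policy(x), dim=1)  # log-probabilities over moves
        v = torch.tanh(self.fc_value(x))              # scalar evaluation in [-1, 1]
        return pi, v
```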
The network is trained to minimize the loss function:
l = ∑_t (v_θ(s_t) − z_t)² − π_t · log(p_θ(s_t))
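Assuming the network returns log-probabilities for the policy head, this loss can be computed in PyTorch roughly as follows (a sketch, not Luna's exact training code):

```python
import torch

def alphazero_loss(log_pi, v, target_pi, target_z):
    value_loss = torch.mean((v.view(-1) - target_z) ** 2)            # (v_θ(s_t) − z_t)²
    policy_loss = -torch.mean(torch.sum(target_pi * log_pi, dim=1))  # −π_t · log p_θ(s_t)
    return value_loss + policy_loss
```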
The MCTS implementation follows these steps (see the selection sketch after this list):
- Initialize a search tree with the current position as root
- In each simulation:
- Select moves that maximize upper confidence bound
- Expand the tree with a new node when a leaf is reached
- Evaluate the position using the neural network
- Backpropagate the evaluation up the search path
- After simulations, choose a move based on visit counts
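The selection step typically uses a PUCT-style rule, in which the cpuct constant from the training args balances exploration against exploitation. A simplified, self-contained illustration (not Luna's actual mcts.py):

```python
import math

def select_action(Q, N, P, legal_actions, cpuct=1.0):
    """Pick the move maximizing Q(s,a) + cpuct * P(s,a) * sqrt(sum_b N(s,b)) / (1 + N(s,a))."""
    total_visits = sum(N.get(a, 0) for a in legal_actions)
    best_action, best_score = None, -float("inf")
    for a in legal_actions:
        # Q: mean action values, N: visit counts, P: priors from the policy head.
        u = Q.get(a, 0.0) + cpuct * P.get(a, 0.0) * math.sqrt(total_visits + 1e-8) / (1 + N.get(a, 0))
        if u > best_score:
            best_action, best_score = a, u
    return best_action
```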
- luna.py: Main interface to the chess engine
- luna_NN.py: Neural network architecture
- NNet.py: Neural network training and inference wrapper
- mcts.py: Monte Carlo Tree Search implementation
- coach.py: Self-play training orchestration
- luna_game.py: Chess game environment
- luna_html_wrapper.py: Web interface for human play
- Missing endpoints in Self-Play mode: If you see a "Not found" error for /next_move, ensure you've updated your luna_html_wrapper.py with the latest self-play implementation.
- Board orientation issues: The board may reset its orientation after a game ends. This is a UI issue that can be fixed in the JavaScript code.
- CUDA out of memory: Reduce the batch size or the number of MCTS simulations if you encounter memory issues during training.
- Training plateau: If performance plateaus, try increasing the Dirichlet noise (dir_alpha) for more exploration, or adjust the learning rate.
- Enhanced opening book integration
- Time management for competitive play
- Multi-GPU training support
- Endgame tablebases integration
- Progressive network pruning for speed optimization
Luna Chess is provided under an open-source license. The project draws inspiration from AlphaZero's approach to chess learning through pure self-play.