⚠️ Version 2 available: Neural-Romance-v2
This project uses a Large Language Model (LLM) to generate a dataset that is then used to train a Multi-Layer Perceptron (MLP) network.
This process is known as "distillation": knowledge from a larger network is transferred to a smaller one.
Distilling the logic of an LLM down to an MLP provides complex decision-making at a fraction of the original compute cost.
- Download the LLM GGUF: download
- Launch the llama.cpp server:
build/bin/llama-server --port 8081 -m Qwen3-30B-A3B-Instruct-2507-Q4_K_M.gguf
- Launch the dataset generator on the same port (a Python sketch of the request loop follows these steps):
php llm.php 8081
- Once it has generated enough lines for the dataset, train the MLP:
python fit.py
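The dataset generator is llm.php; the sketch below is an illustrative Python equivalent, assuming llama-server's OpenAI-compatible /v1/chat/completions endpoint on the same port. The prompt text and the output line format are placeholders, not the ones llm.php actually uses.

```python
# Illustrative sketch of a dataset generator (the real one is llm.php).
# Assumes llama-server is running with --port 8081 and exposes the
# OpenAI-compatible /v1/chat/completions endpoint.
import json
import sys
import urllib.request

PORT = sys.argv[1] if len(sys.argv) > 1 else "8081"
URL = f"http://127.0.0.1:{PORT}/v1/chat/completions"

def ask(prompt: str) -> str:
    payload = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }).encode()
    req = urllib.request.Request(URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

with open("training_data.txt", "a") as out:
    for _ in range(100):
        # Hypothetical prompt; the real prompt and parsing live in llm.php.
        line = ask("Describe a dating scenario and rate the compatibility 0-1, "
                   "as a single comma-separated line of numbers.")
        out.write(line.strip() + "\n")
```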
- llm.php - uses llama.cpp to generate training_data.txt.
- fit.py - uses training_data.txt to train a dense MLP network with TensorFlow (sketched below).
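fit.py's exact architecture and data format are not reproduced here; the following is a minimal sketch of training a dense MLP with TensorFlow, assuming training_data.txt holds comma-separated numeric rows with the target in the last column. The 32-unit ReLU layers and Adam optimizer echo the released model names (relu_adam_32) but are otherwise assumptions.

```python
# Minimal fit.py-style sketch: load the generated dataset and fit a small
# dense MLP. Data format, layer sizes, and loss are assumptions.
import numpy as np
import tensorflow as tf

data = np.loadtxt("training_data.txt", delimiter=",")
x, y = data[:, :-1], data[:, -1]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(x.shape[1],)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="mse")
model.fit(x, y, epochs=50, batch_size=32, validation_split=0.1)
model.save("relu_adam_32.keras")
```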
apt install php-cli php-curl
pip install numpy tensorflow
apt update
apt install -y build-essential git cmake ninja-build pkg-config vulkan-tools mesa-vulkan-drivers libvulkan-dev
apt install -y glslc glslang-tools spirv-tools vulkan-tools
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON -DLLAMA_CURL=OFF -DCMAKE_BUILD_TYPE=Release -G Ninja
cmake --build build -j
- TensorFlow generally trains small MLPs faster on a CPU than on a GPU (see the sketch below).
- LLMs run through llama.cpp are generally faster on a GPU using the Vulkan backend.
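If TensorFlow grabs the GPU anyway, it can be pinned to the CPU before any tensors are created; a small sketch (not necessarily what fit.py does):

```python
# Hide all GPUs from TensorFlow so the small MLP trains on the CPU.
# Must run before any models or tensors are created.
import tensorflow as tf

tf.config.set_visible_devices([], "GPU")
print(tf.config.get_visible_devices())  # should list only CPU devices
```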
- LLM used: Qwen3-30B-A3B-Instruct-2507-Q4_K_M
- Trained models: trained_models.tar.xz
- Prototype builds: romance_dev
- Lowest-loss model: relu_sgd_256
- Demo available at romance.html (relu_adam_32)