@no4ni commented Sep 3, 2025

This directory contains an example demonstrating next-token prediction with LLaMA models through llama.cpp/GGML.
The tool is useful for checking and measuring fine-tuning results on individual examples.
(Currently CPU only.)

Usage
prediction-next-token --model <model_path> --prompt <prompt> [--hypothesis <first_word>]
or the short form:
prediction-next-token -m <model_path> -p <prompt> [-h <first_word>]

Example:
prediction-next-token -m "models\llama-3.2-1B-q4_k_m-128k.gguf" -p "Who invented E=mc^2?" -h "Einstein"
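Under the hood, the probability of the hypothesis token is just a softmax over the model's logits at the last decoded position. The following is a minimal, self-contained sketch of that core step, not the example's actual code; in llama.cpp the logits array would come from llama_get_logits() after llama_decode(), and the toy values here stand in for it:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// Numerically stable softmax probability of token id `tok` over raw logits.
double next_token_prob(const std::vector<float>& logits, int tok) {
    float max_logit = logits[0];
    for (float l : logits) max_logit = std::max(max_logit, l);

    double denom = 0.0;
    for (float l : logits) denom += std::exp((double)(l - max_logit));

    return std::exp((double)(logits[tok] - max_logit)) / denom;
}

int main() {
    // Toy 5-token vocabulary standing in for real model logits
    // (in llama.cpp: llama_get_logits() after llama_decode()).
    std::vector<float> logits = {1.2f, -0.3f, 4.7f, 0.1f, 2.5f};
    int hypothesis_token = 2; // id of the tokenized hypothesis word
    std::printf("P(next = token %d) = %.4f\n", hypothesis_token,
                next_token_prob(logits, hypothesis_token));
    return 0;
}
```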

Notes for non-English UTF-8 text (e.g., Russian)
On Windows, it is recommended to use Windows Terminal and to switch the console code page to UTF-8 before running the tool:
chcp 65001
.\prediction-next-token.exe -m "models\llama-3.2-1B-q4_k_m-128k-ru.gguf" -p "Здравствуйте!" -h "Привет"

  • This ensures correct handling of UTF-8 characters both in input arguments and in console output (see the sketch after this note for a programmatic alternative).
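As an aside, a Windows program can also switch its own console code pages to UTF-8 at startup, which would make the manual chcp 65001 step unnecessary. This is only a sketch of that option, not necessarily what the example does:

```cpp
#ifdef _WIN32
#include <windows.h>
#endif
#include <cstdio>

int main() {
#ifdef _WIN32
    SetConsoleCP(CP_UTF8);       // UTF-8 for console input (equivalent to chcp 65001)
    SetConsoleOutputCP(CP_UTF8); // UTF-8 for console output
#endif
    std::printf("Здравствуйте!\n"); // renders correctly once the console is UTF-8
    return 0;
}
```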

Notes on Model Behavior

  • The --hypothesis argument is optional and specifies the expected first word of the continuation to evaluate.
  • After fine-tuning on a dataset, the perplexity of the model on a test set should decrease over training epochs (see the sketch below).
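For reference, perplexity over a test sequence is the exponential of the average negative log-probability the model assigns to each true token: PPL = exp(-(1/N) * Σ ln p_i). A minimal sketch, assuming the per-token probabilities have already been extracted (e.g., with the softmax approach shown above):

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// PPL = exp(-(1/N) * sum(ln p_i)). Lower is better; after fine-tuning,
// this value on a held-out test set should decrease across epochs.
double perplexity(const std::vector<double>& token_probs) {
    double sum_log_p = 0.0;
    for (double p : token_probs) sum_log_p += std::log(p);
    return std::exp(-sum_log_p / (double)token_probs.size());
}

int main() {
    // Hypothetical probabilities of the true next token at each position.
    std::vector<double> probs = {0.42, 0.10, 0.73, 0.05};
    std::printf("perplexity = %.3f\n", perplexity(probs));
    return 0;
}
```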
