bghira commented Oct 6, 2025

this has been in progress for a few weeks, but already allows end-to-end configuration and training.

validations are a bit weird, there's a bunch of room for improvement, but it's far enough along that it can be used somewhat seriously at this point.

please report any issues you might have.

to get started, you'll have to install this branch, either by git clone and pip install -e . or directly via git over ssh. i'd recommend cloning it. then start the server:

simpletuner server

once it's up, you can browse to http://localhost:8001/web
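if you'd rather check from a script that the server came up before opening a browser, here's a minimal sketch. the URL is the one above; the helper names `webui_url` and `is_up` are made up for illustration and are not part of simpletuner.

```python
import urllib.request


def webui_url(host: str = "localhost", port: int = 8001) -> str:
    """Build the WebUI address mentioned in this PR (host/port are the defaults above)."""
    return f"http://{host}:{port}/web"


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a non-error HTTP status.

    Connection failures and HTTP error responses both raise OSError
    subclasses in urllib, so both are treated as "not up" here.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 400
    except OSError:
        return False
```

calling `is_up(webui_url())` right after `simpletuner server` starts may return False for a few seconds while the server is still booting, so retry before giving up.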

when you get to the interface, assuming it all works, you'll see a couple configuration questions that you'll need to answer about where files should be stored. these configs are stored in ~/.simpletuner on Mac and Linux both, in case you have to remove or manually fix them.

if you want to reuse your simpletuner env style configs, move them to a new dir.

  • create something like ~/simpletuner
  • move your config/ folder to ~/simpletuner/config
  • move your output/ folder to ~/simpletuner/output
  • use these two paths during onboarding
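the steps above can be scripted if you prefer. this is only a sketch: `migrate` is a hypothetical helper, and it assumes your old config/ and output/ folders sit next to each other under one directory (for example your existing checkout).

```python
import shutil
from pathlib import Path


def migrate(old_root: Path, new_root: Path, names=("config", "output")) -> dict:
    """Move config/ and output/ from old_root into new_root.

    Returns a mapping of folder name to its new location. Folders that
    do not exist under old_root are skipped; existing targets are never
    overwritten.
    """
    new_root.mkdir(parents=True, exist_ok=True)
    moved = {}
    for name in names:
        src = old_root / name
        dst = new_root / name
        if not src.is_dir():
            continue
        if dst.exists():
            raise FileExistsError(f"refusing to overwrite {dst}")
        shutil.move(str(src), str(dst))
        moved[name] = dst
    return moved
```

for example, `migrate(Path("path/to/old/checkout"), Path.home() / "simpletuner")` would leave you with ~/simpletuner/config and ~/simpletuner/output, which are the two paths to enter during onboarding.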

if your old configs used relative paths like foo/config.json, they'll all be detected just fine in their new location. you'll have to update the paths yourself if you're using absolute locations.
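to find the absolute paths that need updating, something like this can help. it's a sketch that walks any JSON file and reports string values starting with "/"; it makes no assumption about which keys simpletuner actually uses, and the function names are hypothetical.

```python
import json
from pathlib import Path


def absolute_path_values(node, prefix=""):
    """Yield (key_path, value) for every string value that starts with '/'."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else str(key)
            yield from absolute_path_values(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from absolute_path_values(value, f"{prefix}[{index}]")
    elif isinstance(node, str) and node.startswith("/"):
        yield prefix, node


def scan_file(path: Path):
    """Load one JSON config and list the values that look like absolute paths."""
    return list(absolute_path_values(json.loads(path.read_text())))
```

relative values such as foo/config.json are not reported, which matches the note above that they keep working after the move.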

it doesn't have any kind of security layer at the moment, though it does at least protect you from path traversal. so it's not recommended to run this on a public IP unless you like playing roulette.
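for anyone unfamiliar with what a path traversal guard does, here's a generic sketch of the idea. this is not the code in this branch, just a common way to do it with pathlib (needs Python 3.9+ for `is_relative_to`); `resolve_inside` is a hypothetical name.

```python
from pathlib import Path


def resolve_inside(base: Path, user_path: str) -> Path:
    """Resolve user_path under base, rejecting anything that escapes base."""
    base = base.resolve()
    # Joining with an absolute user_path discards base, and ".." segments
    # are collapsed by resolve(), so both cases fail the check below.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes {base}: {user_path!r}")
    return candidate
```

a guard like this only limits which files a request can reach. it does nothing about who is allowed to make the request, which is why the public IP warning still applies.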

errata:

  • some known bugs exist.
  • no webUI documentation is written.
  • the user interface notifications, layout, colour scheme, and timing can be off.
  • the dataset builder interface has a lot of features but currently, using it kinda sucks.
  • right now there's no way to reset the onboarding setup config (the one in ~/.simpletuner) so you have to manually edit that, but it's almost never needed.
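if you do end up editing ~/.simpletuner by hand, it's worth taking a copy first. a minimal sketch, with `backup_dir` as a hypothetical helper and no assumptions about the files inside the directory:

```python
import shutil
import time
from pathlib import Path


def backup_dir(src: Path, stamp=None) -> Path:
    """Copy src to a sibling directory named <name>.bak-<stamp> and return it."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"{src.name}.bak-{stamp}")
    # copytree refuses to overwrite an existing destination, so an older
    # backup with the same stamp is never clobbered.
    shutil.copytree(src, dst)
    return dst
```

for example, `backup_dir(Path.home() / ".simpletuner")` leaves the original in place and gives you something to restore from if the manual edit goes wrong.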

the main bug you'll currently encounter is that validation images get incorrectly copied to newer checkpoint folders if you don't have the step intervals lined up perfectly, e.g. 50 steps for checkpointing AND a 50 step interval for validating.
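a quick way to sanity check your intervals before training. the only case this PR confirms as safe is identical intervals, which is what `intervals_match` tests. the second helper is a looser guess: it assumes a validation is only at risk when it runs on a step with no checkpoint, which is an assumption and not something stated above. both names are hypothetical.

```python
def intervals_match(checkpoint_every: int, validate_every: int) -> bool:
    """Strict check: the one configuration described above as lined up."""
    return checkpoint_every == validate_every


def validation_steps_without_checkpoint(
    checkpoint_every: int, validate_every: int, max_steps: int
) -> list:
    """List validation steps that do not coincide with a checkpoint step."""
    if checkpoint_every <= 0 or validate_every <= 0:
        raise ValueError("intervals must be positive")
    return [
        step
        for step in range(validate_every, max_steps + 1, validate_every)
        if step % checkpoint_every != 0
    ]
```

with checkpointing every 50 steps and validating every 20, over 100 steps, the second helper reports steps 20, 40, 60 and 80. with 50 and 50 it reports nothing.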

the other is that the event viewer interface tends to just redraw the entire fkn thing for no reason.

the remaining issue blocking merge is simply usability tweaks and documentation.

screenshot: configuring the model path, flavour, etc

screenshot: managing checkpoints

bghira added 30 commits October 6, 2025 01:24
bghira and others added 27 commits October 16, 2025 09:45
Added detailed onboarding steps and configuration guidelines for new users of the WebUI, including dataset management and training environment setup.
Updated DeepSpeed integration details in README.
Updated tutorial references and quick start instructions in README.
bghira merged commit 0ea7b81 into main Oct 17, 2025
1 check passed
bghira removed the work-in-progress label Oct 17, 2025
bghira deleted the feature/webui-phase-one branch October 21, 2025 17:19
Labels: difficult-feature, documentation, enhancement

2 participants