DialectSense

End-to-end, fully automated pipeline for dialect embedding and coarse-label classification:

Audio QC + preprocessing
 → WavLM-Large embeddings (chunked aggregation)
 → speaker-disjoint split (by uploader_id)
 → label coarsening (train-only label-centroid KMeans)
 → coarse-label training (stacked: Linear SVM + MLP → meta LogisticRegression)
 → evaluation + visualizations + report artifacts
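The label-coarsening step can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes one mean embedding (centroid) per fine label, computed from training clips only, and then runs KMeans over those centroids. The function name `coarsen_labels`, the toy data, and the cluster count are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans


def coarsen_labels(train_embeddings, train_labels, n_clusters, seed=0):
    """Map each fine label to a coarse cluster id, using training data only."""
    train_labels = np.asarray(train_labels)
    fine_labels = sorted(set(train_labels.tolist()))
    # One centroid per fine label: the mean embedding of its training clips.
    centroids = np.stack(
        [train_embeddings[train_labels == lab].mean(axis=0) for lab in fine_labels]
    )
    # Cluster the label centroids (not the individual clips).
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(centroids)
    return {lab: int(c) for lab, c in zip(fine_labels, km.labels_)}


# Toy data (hypothetical): four fine labels forming two well-separated groups.
rng = np.random.default_rng(0)
offsets = {"a": 0.0, "b": 0.2, "c": 10.0, "d": 10.2}
labels = [lab for lab in offsets for _ in range(20)]
X = np.stack([rng.normal(offsets[lab], 0.05, size=8) for lab in labels])

label_to_cluster = coarsen_labels(X, labels, n_clusters=2)
```

Fitting the mapping on the training split alone keeps validation and test clips from influencing which fine labels get merged.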

Quickstart

make smoke

Artifacts are written to artifacts/<run_name>/ (for smoke: artifacts/smoke/).

Web UI

make ui CONFIG=configs/smoke.json

The UI includes a Realtime page. It captures fixed-length chunks from the microphone as a stream and progressively plots confidence line charts for all candidate clusters, so predictions can be followed in real time.
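As a rough illustration of fixed-length chunking over a stream, the sketch below buffers incoming audio blocks and emits a chunk whenever enough samples have accumulated. The class name `FixedChunker`, the 16 kHz sample rate, and the block sizes are assumptions made for the example; the actual UI code may work differently.

```python
import numpy as np


class FixedChunker:
    """Buffer incoming audio and emit fixed-length chunks (hypothetical helper)."""

    def __init__(self, chunk_samples):
        self.chunk_samples = chunk_samples
        self._buffer = np.zeros(0, dtype=np.float32)

    def push(self, samples):
        """Append samples and return the complete chunks now available."""
        samples = np.asarray(samples, dtype=np.float32)
        self._buffer = np.concatenate([self._buffer, samples])
        chunks = []
        while len(self._buffer) >= self.chunk_samples:
            chunks.append(self._buffer[: self.chunk_samples])
            self._buffer = self._buffer[self.chunk_samples:]
        return chunks


# Assumed values: 1-second chunks at 16 kHz, fed by 0.4-second microphone blocks.
chunker = FixedChunker(chunk_samples=16000)
emitted = []
for _ in range(5):
    block = np.zeros(6400, dtype=np.float32)
    emitted.extend(chunker.push(block))
```

Each emitted chunk would then be embedded and classified, and the per-cluster confidences appended to the chart.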

Dependencies

Python 3.10+
make (GNU Make). If you don't have it, either install it (Linux: sudo apt-get install make) or run the CLI commands directly (see below / RUN_WINDOWS.md).
ffmpeg is required for .ogg decoding + silence trimming. The Makefile bootstraps a local static ffmpeg into .cache/ffmpeg/ if you don't have a system ffmpeg.
Python deps: make deps (handled automatically by make smoke / make ui)

Makefile Usage

The Makefile is the recommended “one-command” runner on Linux/macOS/WSL:

make smoke
make ui CONFIG=configs/smoke.json
make clean CONFIG=configs/smoke.json

You can switch configs via CONFIG=...:

make smoke CONFIG=configs/smoke.json
make preprocess embed split coarsen train eval report CONFIG=configs/full.json

By default, configs/full.json uses a stacked coarse classifier (SVM + MLP → meta LR) to improve accuracy. To retrain the model and re-evaluate only:

make train eval report CONFIG=configs/full.json
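For readers unfamiliar with stacking, here is a minimal sketch of the same shape (Linear SVM + MLP feeding a meta LogisticRegression) built with scikit-learn's `StackingClassifier`. It is an assumption that the project uses anything like this; the hyperparameters, the feature scaling, the plain 3-fold internal CV, and the toy data are all illustrative.

```python
import numpy as np
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


def build_stacked_classifier(seed=0):
    """Linear SVM + MLP base models, combined by a logistic-regression meta model."""
    base = [
        ("svm", make_pipeline(StandardScaler(), LinearSVC(random_state=seed))),
        (
            "mlp",
            make_pipeline(
                StandardScaler(),
                MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=seed),
            ),
        ),
    ]
    # The meta model is trained on out-of-fold base predictions (cv=3 here).
    return StackingClassifier(
        estimators=base,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=3,
    )


# Toy embeddings (hypothetical): two well-separated coarse classes.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2, 1, (60, 16)), rng.normal(2, 1, (60, 16))])
y = np.array([0] * 60 + [1] * 60)

clf = build_stacked_classifier().fit(X, y)
train_accuracy = clf.score(X, y)
```

With real data, the internal folds would ideally respect speaker boundaries as well, for example by passing a group-aware splitter.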

If your environment does not have make (common on Windows), follow RUN_WINDOWS.md and run the Python CLI commands instead.

Outputs

After make smoke, look at:

artifacts/smoke/audio_qc.csv (per-clip preprocessing/QC decisions)
artifacts/smoke/splits.csv (speaker-disjoint train/val/test)
artifacts/smoke/label_to_cluster.json + artifacts/smoke/cluster_summary.md (coarse mapping)
artifacts/smoke/models/coarse_model.joblib (trained coarse classifier)
artifacts/smoke/report_coarse.json + artifacts/smoke/top_confusions.csv (metrics + confusions)
artifacts/smoke/figures/ (PNG plots: UMAP/t-SNE, confusion matrix, QC plots, etc.)
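A speaker-disjoint split assigns every uploader to exactly one partition, so no speaker appears in both training and evaluation data. The sketch below shows one simple way to do that and to check the result. Only `uploader_id` comes from this README; the `clip_id` and `split` field names, the 80/10/10 fractions, and the function name are assumptions, and the real splits.csv schema may differ.

```python
import random
from collections import defaultdict


def speaker_disjoint_split(rows, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign each uploader_id to exactly one of train/val/test."""
    speakers = sorted({r["uploader_id"] for r in rows})
    random.Random(seed).shuffle(speakers)
    n_train = int(len(speakers) * fractions[0])
    n_val = int(len(speakers) * fractions[1])
    assignment = {}
    for i, spk in enumerate(speakers):
        if i < n_train:
            assignment[spk] = "train"
        elif i < n_train + n_val:
            assignment[spk] = "val"
        else:
            assignment[spk] = "test"
    # Every clip inherits the split of its uploader.
    return [dict(r, split=assignment[r["uploader_id"]]) for r in rows]


# Toy metadata (hypothetical): 50 clips from 10 uploaders.
rows = [{"clip_id": f"c{i}", "uploader_id": f"u{i % 10}"} for i in range(50)]
split_rows = speaker_disjoint_split(rows)

# Sanity check: collect the uploaders seen in each split.
speakers_by_split = defaultdict(set)
for r in split_rows:
    speakers_by_split[r["split"]].add(r["uploader_id"])
```

The same disjointness check can be run against a generated splits file once its column names are known.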

CLI

Each stage is runnable independently (and reuses cached artifacts when present):

.venv/bin/python -m dialectsense.cli preprocess --config configs/smoke.json
.venv/bin/python -m dialectsense.cli embed      --config configs/smoke.json
.venv/bin/python -m dialectsense.cli split      --config configs/smoke.json
.venv/bin/python -m dialectsense.cli coarsen    --config configs/smoke.json
.venv/bin/python -m dialectsense.cli train      --config configs/smoke.json
.venv/bin/python -m dialectsense.cli eval       --config configs/smoke.json
.venv/bin/python -m dialectsense.cli report     --config configs/smoke.json
.venv/bin/python -m dialectsense.cli ui         --config configs/smoke.json
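Where make is unavailable, the batch stages can also be chained from a small Python script. This is a sketch only: the stage names and the `--config` flag are taken from the commands above, while the helper names `build_commands` and `run_pipeline` are hypothetical. It uses whichever interpreter runs the script, so it should be launched with the project's virtual-environment Python.

```python
import subprocess
import sys

# Batch stages in the order listed above (the interactive ui stage is left out).
STAGES = ["preprocess", "embed", "split", "coarsen", "train", "eval", "report"]


def build_commands(config, stages=STAGES, python=sys.executable):
    """Build one CLI invocation per stage."""
    return [
        [python, "-m", "dialectsense.cli", stage, "--config", config]
        for stage in stages
    ]


def run_pipeline(config, stages=STAGES):
    """Run the stages in order, stopping at the first failure."""
    for cmd in build_commands(config, stages):
        subprocess.run(cmd, check=True)


commands = build_commands("configs/smoke.json")
# To execute the stages: run_pipeline("configs/smoke.json")
```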

