Merged
46 commits
2f0d2f8
Add full_distill_cot results
fxlrnrpt Sep 24, 2025
a90db2c
Add first stage of pipeline
fxlrnrpt Sep 24, 2025
1307de3
Add pt2 of the pipeline
fxlrnrpt Sep 24, 2025
4e5e461
Add alternative baseline
fxlrnrpt Sep 24, 2025
ae21fb5
Upd readme
fxlrnrpt Sep 24, 2025
a6f5717
Fix typo
fxlrnrpt Sep 24, 2025
573f17a
Fix answer indexing
fxlrnrpt Sep 25, 2025
1ad6284
Fix truncation
fxlrnrpt Sep 26, 2025
139fc4e
Fix full distill baseline config
fxlrnrpt Sep 27, 2025
5849714
Change eval test period for alternative
fxlrnrpt Sep 27, 2025
707ff0c
Add phi pt1 results
DanielVyazhev Oct 1, 2025
0b3e7f3
Add phi pt2 results
DanielVyazhev Oct 1, 2025
17c1e41
Add qwen pt1 results
DanielVyazhev Oct 1, 2025
2d91957
Add qwen pt2 results
DanielVyazhev Oct 1, 2025
706b05c
Add pipeline results
fxlrnrpt Oct 1, 2025
204ff5c
Add analysis
fxlrnrpt Oct 1, 2025
f3daa44
Support list of eval epochs
fxlrnrpt Oct 1, 2025
0c34013
new train results
DanielVyazhev Oct 2, 2025
3209789
Add files via upload
DanielVyazhev Oct 2, 2025
3f56660
Delete artifacts/pipeline_20epochs/pipeline/qwen/pt2/training_98392_r…
DanielVyazhev Oct 2, 2025
25cded0
Add files via upload
DanielVyazhev Oct 2, 2025
695dc86
Add files via upload
DanielVyazhev Oct 2, 2025
8ab6bb0
Delete artifacts/pipeline_20epochs/pipeline/qwen/pt1/training_37578_r…
DanielVyazhev Oct 2, 2025
1a16c2c
Add files via upload
DanielVyazhev Oct 2, 2025
c927d05
Add files via upload
DanielVyazhev Oct 2, 2025
6e847e2
Delete artifacts/pipeline_20epochs/pipeline/qwen/pt2/training_37578_r…
DanielVyazhev Oct 2, 2025
888fe2e
Add files via upload
DanielVyazhev Oct 2, 2025
e310d19
Add files via upload
DanielVyazhev Oct 3, 2025
361da5b
Add files via upload
DanielVyazhev Oct 3, 2025
907b8ca
Create cross_entropy pipeline
DanielVyazhev Oct 3, 2025
e64a713
Create cross_entropy alternative
DanielVyazhev Oct 3, 2025
7e744d7
Create sft cross-entropy
DanielVyazhev Oct 3, 2025
9487159
Create cross-entropy curriculum
DanielVyazhev Oct 3, 2025
bed71aa
Data cleanup
fxlrnrpt Oct 3, 2025
1179f85
Data cleanup
fxlrnrpt Oct 3, 2025
4084578
Data cleanup
fxlrnrpt Oct 3, 2025
e502d06
Create cross_entropy alternative qwen
DanielVyazhev Oct 3, 2025
380de8f
Add ablations
fxlrnrpt Oct 3, 2025
393eaa9
Create cross_entropy pipeline qwen
DanielVyazhev Oct 3, 2025
9d6960e
Create sft cross-entropy qwen
DanielVyazhev Oct 3, 2025
30babc9
Create cross-entropy curriculum qwen
DanielVyazhev Oct 3, 2025
f1852e5
Cleanup
fxlrnrpt Oct 3, 2025
26e2fc3
Add easy_metrics.jsonl qwen
DanielVyazhev Oct 4, 2025
895ca6b
Add files via upload
DanielVyazhev Oct 4, 2025
2070ae3
Add easy_metrics.jsonl phi4
DanielVyazhev Oct 4, 2025
3619e61
Add files via upload
DanielVyazhev Oct 4, 2025
4 changes: 3 additions & 1 deletion .gitignore
@@ -6,5 +6,7 @@ __pycache__
*.parquet
artifacts/**
!artifacts/**/
!artifacts/pipeline_20epochs/sft_baseline
!artifacts/pipeline_20epochs/sft_baseline/**/*
!artifacts/pipeline_20epochs/full_distill_baseline/**/*
!artifacts/pipeline_20epochs/**/trainer_state.json
!artifacts/pipeline_20epochs/pipeline/**/**/*.jsonl
15 changes: 10 additions & 5 deletions README.md
@@ -1,6 +1,7 @@
 # Code for "Complexity-aware fine-tuning" paper

-General purpose Large Language Models (LLMs) are frequently fine-tuned to improve performance in niche domains. Although fine-tuning is a standard practice, we still lack a deep understanding of how to aggregate data for better results. In this work, we show that the entropy-based output estimation provides a meaningful guideline for fine-tuning data preparation. Specifically, across two small open models ~3B$ we find that a single token answer entropy shows ROC AUC score of ~0.73 and allows us to split the training data into three complexity categories. Moreover, we discover that these categories require different tuning mechanisms. Leveraging these insights, we propose a novel blueprint for efficient fine-tuning that outperforms the standard approach (TODO vs TODO accuracy). We also provide an in-depth analysis of alternative complexity estimation techniques based on expert assessment via model-as-judge (MASJ), entropy aggregation, and reasoning metadata with ROC AUC scores of 0.57, TODO and TODO accordingly. Our findings facilitate immediate enhancements in fine-tuning performance. In addition, we path the way to further investigation and immersion of the numerical complexity analysis.
+General-purpose Large Language Models (LLMs) are frequently fine-tuned through supervised fine-tuning (SFT) to enhance performance in specific domains. Better results can be achieved by distilling the chain-of-thought of a larger model at the cost of numerous expensive calls and a much greater amount of data.
+We propose a novel blueprint for efficient fine-tuning that uses reasoning only for complex data identified by entropy. Specifically, across two small open models ($\approx 3B$) we split the training data into complexity categories by a single token answer entropy (ROC AUC $0.73$), fine-tune large language models (LLMs) via SFT and distillation, and show that our pipeline significantly outperforms the standard SFT approach ($0.55$ vs $0.43$ average accuracy) and provides comparable with distillation performance while using $62\%$ less data ($0.55$ average accuracy for both).

 ## Prerequisites

@@ -16,10 +17,14 @@ Other datasets are included in the repo and also published on Huggingface:
 - [MMLU Pro reasoning score](https://huggingface.co/datasets/LabARSS/MMLU-Pro-reasoning-score)
 - [MMLU Pro single token entropy](https://huggingface.co/datasets/LabARSS/MMLU-Pro-single-token-entropy)

-## Running experiments
+## Training pipeline

-`uv run src/experiments/REPLACE_ME.py`
+1. Main training pipeline - `src/experiments/pipeline/pipeline`
+2. Alternative baseline - `src/experiments/pipeline/alternative_baseline`
+3. Full distillation baseline - `src/experiments/pipeline/full_distill`
+3. Full SFT baseline - `src/experiments/pipeline/sft_baseline`
+3. Curriculum SFT baseline - `src/experiments/pipeline/sft_curriculum_baseline`

-## Cite
+## Running experiments

-TODO
+`uv run src/experiments/REPLACE_ME.py`
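
The abstract in the README hunk above describes splitting training data into complexity categories by single-token answer entropy. The following is a minimal sketch of that idea, not the repository's implementation: it assumes the model's probabilities over the answer options are already available, and the function names, category labels, and the `low`/`high` thresholds are all hypothetical.

```python
import math


def answer_entropy(option_probs):
    """Shannon entropy (in nats) of the model's distribution over answer options."""
    total = sum(option_probs)
    normalized = [p / total for p in option_probs]
    return -sum(p * math.log(p) for p in normalized if p > 0)


def complexity_category(entropy, low=0.5, high=1.0):
    # Hypothetical thresholds and labels; the actual cut-offs are not shown in this diff.
    if entropy < low:
        return "easy"
    if entropy < high:
        return "medium"
    return "hard"


confident = [0.97, 0.01, 0.01, 0.01]  # model is nearly sure of one option
uncertain = [0.25, 0.25, 0.25, 0.25]  # model is effectively guessing

print(complexity_category(answer_entropy(confident)))  # easy
print(complexity_category(answer_entropy(uncertain)))  # hard
```

Low entropy means the probability mass sits on one option, so the question is treated as simple for that model; a uniform distribution over four options reaches the maximum entropy of ln 4 and lands in the hardest bucket.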
@@ -0,0 +1,20 @@
{"epoch": 1, "accuracy": 0.46, "desc": "test_balanced"}
{"epoch": 2, "accuracy": 0.46, "desc": "test_balanced"}
{"epoch": 3, "accuracy": null, "desc": "test_balanced"}
{"epoch": 4, "accuracy": 0.47333333333333333, "desc": "test_balanced"}
{"epoch": 5, "accuracy": null, "desc": "test_balanced"}
{"epoch": 6, "accuracy": 0.44, "desc": "test_balanced"}
{"epoch": 7, "accuracy": null, "desc": "test_balanced"}
{"epoch": 8, "accuracy": 0.45666666666666667, "desc": "test_balanced"}
{"epoch": 9, "accuracy": null, "desc": "test_balanced"}
{"epoch": 10, "accuracy": 0.48, "desc": "test_balanced"}
{"epoch": 11, "accuracy": null, "desc": "test_balanced"}
{"epoch": 12, "accuracy": 0.6166666666666667, "desc": "test_balanced"}
{"epoch": 13, "accuracy": null, "desc": "test_balanced"}
{"epoch": 14, "accuracy": 0.5866666666666667, "desc": "test_balanced"}
{"epoch": 15, "accuracy": null, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.6366666666666667, "desc": "test_balanced"}
{"epoch": 17, "accuracy": null, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.6066666666666667, "desc": "test_balanced"}
{"epoch": 19, "accuracy": null, "desc": "test_balanced"}
{"epoch": 20, "accuracy": 0.5866666666666667, "desc": "test_balanced"}
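
The metrics files added in this PR hold one JSON object per line, and some epochs carry `"accuracy": null`, presumably because they were skipped by the evaluation schedule (an assumption; the diff does not say). Below is a minimal sketch of reading such a file and picking the best evaluated epoch. The helper names are hypothetical, and the sample is a few records copied from the hunk above rather than a real file path.

```python
import json

# A few records copied from the hunk above; a real run would read the .jsonl file instead.
SAMPLE = """\
{"epoch": 14, "accuracy": 0.5866666666666667, "desc": "test_balanced"}
{"epoch": 15, "accuracy": null, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.6366666666666667, "desc": "test_balanced"}
{"epoch": 17, "accuracy": null, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.6066666666666667, "desc": "test_balanced"}
"""


def load_metrics(text):
    """Map epoch -> accuracy, dropping epochs whose accuracy is null."""
    records = (json.loads(line) for line in text.splitlines() if line.strip())
    return {r["epoch"]: r["accuracy"] for r in records if r["accuracy"] is not None}


def best_epoch(metrics):
    """Return (epoch, accuracy) for the highest recorded accuracy."""
    epoch = max(metrics, key=metrics.get)
    return epoch, metrics[epoch]


metrics = load_metrics(SAMPLE)
print(sorted(metrics))      # [14, 16, 18]
print(best_epoch(metrics))  # (16, 0.6366666666666667)
```

Filtering out `null` before taking the maximum matters because `max` cannot compare `None` with floats.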
@@ -0,0 +1,20 @@
{"epoch": 1, "accuracy": null, "desc": "test_balanced"}
{"epoch": 2, "accuracy": 0.26, "desc": "test_balanced"}
{"epoch": 3, "accuracy": null, "desc": "test_balanced"}
{"epoch": 4, "accuracy": 0.35333333333333333, "desc": "test_balanced"}
{"epoch": 5, "accuracy": null, "desc": "test_balanced"}
{"epoch": 6, "accuracy": 0.38333333333333336, "desc": "test_balanced"}
{"epoch": 7, "accuracy": null, "desc": "test_balanced"}
{"epoch": 8, "accuracy": 0.39, "desc": "test_balanced"}
{"epoch": 9, "accuracy": null, "desc": "test_balanced"}
{"epoch": 10, "accuracy": 0.3466666666666667, "desc": "test_balanced"}
{"epoch": 11, "accuracy": null, "desc": "test_balanced"}
{"epoch": 12, "accuracy": 0.4633333333333333, "desc": "test_balanced"}
{"epoch": 13, "accuracy": null, "desc": "test_balanced"}
{"epoch": 14, "accuracy": 0.47333333333333333, "desc": "test_balanced"}
{"epoch": 15, "accuracy": null, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.49, "desc": "test_balanced"}
{"epoch": 17, "accuracy": null, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.47, "desc": "test_balanced"}
{"epoch": 19, "accuracy": null, "desc": "test_balanced"}
{"epoch": 20, "accuracy": 0.5066666666666667, "desc": "test_balanced"}
@@ -0,0 +1,20 @@
{"epoch": 1, "accuracy": 0.4379, "desc": "test_balanced"}
{"epoch": 2, "accuracy": 0.4539, "desc": "test_balanced"}
{"epoch": 3, "accuracy": 0.44770000000000004, "desc": "test_balanced"}
{"epoch": 4, "accuracy": 0.45020000000000004, "desc": "test_balanced"}
{"epoch": 5, "accuracy": 0.4539, "desc": "test_balanced"}
{"epoch": 6, "accuracy": 0.46, "desc": "test_balanced"}
{"epoch": 7, "accuracy": 0.46130000000000004, "desc": "test_balanced"}
{"epoch": 8, "accuracy": 0.45880000000000004, "desc": "test_balanced"}
{"epoch": 9, "accuracy": 0.4576, "desc": "test_balanced"}
{"epoch": 10, "accuracy": 0.45880000000000004, "desc": "test_balanced"}
{"epoch": 11, "accuracy": 0.45630000000000004, "desc": "test_balanced"}
{"epoch": 12, "accuracy": 0.4539, "desc": "test_balanced"}
{"epoch": 13, "accuracy": 0.4465, "desc": "test_balanced"}
{"epoch": 14, "accuracy": 0.46130000000000004, "desc": "test_balanced"}
{"epoch": 15, "accuracy": 0.45020000000000004, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.449, "desc": "test_balanced"}
{"epoch": 17, "accuracy": 0.4465, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.4403, "desc": "test_balanced"}
{"epoch": 19, "accuracy": 0.4526, "desc": "test_balanced"}
{"epoch": 20, "accuracy": 0.4416, "desc": "test_balanced"}
@@ -0,0 +1,20 @@
{"epoch": 1, "accuracy": 0.2829, "desc": "test_balanced"}
{"epoch": 2, "accuracy": 0.23989999999999997, "desc": "test_balanced"}
{"epoch": 3, "accuracy": 0.2546, "desc": "test_balanced"}
{"epoch": 4, "accuracy": 0.2509, "desc": "test_balanced"}
{"epoch": 5, "accuracy": 0.2706, "desc": "test_balanced"}
{"epoch": 6, "accuracy": 0.30010000000000003, "desc": "test_balanced"}
{"epoch": 7, "accuracy": 0.321, "desc": "test_balanced"}
{"epoch": 8, "accuracy": 0.3321, "desc": "test_balanced"}
{"epoch": 9, "accuracy": 0.3333, "desc": "test_balanced"}
{"epoch": 10, "accuracy": 0.3235, "desc": "test_balanced"}
{"epoch": 11, "accuracy": 0.3272, "desc": "test_balanced"}
{"epoch": 12, "accuracy": 0.3235, "desc": "test_balanced"}
{"epoch": 13, "accuracy": 0.31, "desc": "test_balanced"}
{"epoch": 14, "accuracy": 0.3173, "desc": "test_balanced"}
{"epoch": 15, "accuracy": 0.3124, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.294, "desc": "test_balanced"}
{"epoch": 17, "accuracy": 0.2768, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.2915, "desc": "test_balanced"}
{"epoch": 19, "accuracy": 0.2534, "desc": "test_balanced"}
{"epoch": 20, "accuracy": 0.2743, "desc": "test_balanced"}
@@ -0,0 +1,20 @@
{"epoch": 1, "accuracy": 0.3444, "desc": "test_balanced"}
{"epoch": 2, "accuracy": 0.385, "desc": "test_balanced"}
{"epoch": 3, "accuracy": 0.4133, "desc": "test_balanced"}
{"epoch": 4, "accuracy": 0.4108, "desc": "test_balanced"}
{"epoch": 5, "accuracy": 0.4194, "desc": "test_balanced"}
{"epoch": 6, "accuracy": 0.4268, "desc": "test_balanced"}
{"epoch": 7, "accuracy": 0.4244, "desc": "test_balanced"}
{"epoch": 8, "accuracy": 0.4379, "desc": "test_balanced"}
{"epoch": 9, "accuracy": 0.43420000000000003, "desc": "test_balanced"}
{"epoch": 10, "accuracy": 0.4391, "desc": "test_balanced"}
{"epoch": 11, "accuracy": 0.4293, "desc": "test_balanced"}
{"epoch": 12, "accuracy": 0.4379, "desc": "test_balanced"}
{"epoch": 13, "accuracy": 0.4379, "desc": "test_balanced"}
{"epoch": 14, "accuracy": 0.43170000000000003, "desc": "test_balanced"}
{"epoch": 15, "accuracy": 0.4268, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.43170000000000003, "desc": "test_balanced"}
{"epoch": 17, "accuracy": 0.4071, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.41700000000000004, "desc": "test_balanced"}
{"epoch": 19, "accuracy": 0.4022, "desc": "test_balanced"}
{"epoch": 20, "accuracy": 0.38380000000000003, "desc": "test_balanced"}
@@ -0,0 +1,20 @@
{"epoch": 1, "accuracy": 0.3749, "desc": "test_balanced"}
{"epoch": 2, "accuracy": 0.3715, "desc": "test_balanced"}
{"epoch": 3, "accuracy": 0.4052, "desc": "test_balanced"}
{"epoch": 4, "accuracy": 0.39390000000000003, "desc": "test_balanced"}
{"epoch": 5, "accuracy": 0.3883, "desc": "test_balanced"}
{"epoch": 6, "accuracy": 0.39280000000000004, "desc": "test_balanced"}
{"epoch": 7, "accuracy": 0.3951, "desc": "test_balanced"}
{"epoch": 8, "accuracy": 0.4007, "desc": "test_balanced"}
{"epoch": 9, "accuracy": 0.3895, "desc": "test_balanced"}
{"epoch": 10, "accuracy": 0.3872, "desc": "test_balanced"}
{"epoch": 11, "accuracy": 0.3872, "desc": "test_balanced"}
{"epoch": 12, "accuracy": 0.3749, "desc": "test_balanced"}
{"epoch": 13, "accuracy": 0.38159999999999994, "desc": "test_balanced"}
{"epoch": 14, "accuracy": 0.3872, "desc": "test_balanced"}
{"epoch": 15, "accuracy": 0.36590000000000006, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.36700000000000005, "desc": "test_balanced"}
{"epoch": 17, "accuracy": 0.3558, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.36479999999999996, "desc": "test_balanced"}
{"epoch": 19, "accuracy": 0.3715, "desc": "test_balanced"}
{"epoch": 20, "accuracy": 0.3625, "desc": "test_balanced"}
@@ -0,0 +1,20 @@
{"epoch": 1, "accuracy": 0.1437, "desc": "test_balanced"}
{"epoch": 2, "accuracy": 0.2065, "desc": "test_balanced"}
{"epoch": 3, "accuracy": 0.248, "desc": "test_balanced"}
{"epoch": 4, "accuracy": 0.2705, "desc": "test_balanced"}
{"epoch": 5, "accuracy": 0.2682, "desc": "test_balanced"}
{"epoch": 6, "accuracy": 0.2626, "desc": "test_balanced"}
{"epoch": 7, "accuracy": 0.3098, "desc": "test_balanced"}
{"epoch": 8, "accuracy": 0.2896, "desc": "test_balanced"}
{"epoch": 9, "accuracy": 0.2907, "desc": "test_balanced"}
{"epoch": 10, "accuracy": 0.29410000000000003, "desc": "test_balanced"}
{"epoch": 11, "accuracy": 0.2727, "desc": "test_balanced"}
{"epoch": 12, "accuracy": 0.2851, "desc": "test_balanced"}
{"epoch": 13, "accuracy": 0.28059999999999996, "desc": "test_balanced"}
{"epoch": 14, "accuracy": 0.2963, "desc": "test_balanced"}
{"epoch": 15, "accuracy": 0.27949999999999997, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.3019, "desc": "test_balanced"}
{"epoch": 17, "accuracy": 0.2738, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.2952, "desc": "test_balanced"}
{"epoch": 19, "accuracy": 0.2828, "desc": "test_balanced"}
{"epoch": 20, "accuracy": 0.2783, "desc": "test_balanced"}
@@ -0,0 +1,20 @@
{"epoch": 1, "accuracy": 0.2379, "desc": "test_balanced"}
{"epoch": 2, "accuracy": 0.26489999999999997, "desc": "test_balanced"}
{"epoch": 3, "accuracy": 0.3221, "desc": "test_balanced"}
{"epoch": 4, "accuracy": 0.349, "desc": "test_balanced"}
{"epoch": 5, "accuracy": 0.3479, "desc": "test_balanced"}
{"epoch": 6, "accuracy": 0.3715, "desc": "test_balanced"}
{"epoch": 7, "accuracy": 0.3917, "desc": "test_balanced"}
{"epoch": 8, "accuracy": 0.3603, "desc": "test_balanced"}
{"epoch": 9, "accuracy": 0.376, "desc": "test_balanced"}
{"epoch": 10, "accuracy": 0.3636, "desc": "test_balanced"}
{"epoch": 11, "accuracy": 0.3704, "desc": "test_balanced"}
{"epoch": 12, "accuracy": 0.3625, "desc": "test_balanced"}
{"epoch": 13, "accuracy": 0.3591, "desc": "test_balanced"}
{"epoch": 14, "accuracy": 0.358, "desc": "test_balanced"}
{"epoch": 15, "accuracy": 0.3558, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.3378, "desc": "test_balanced"}
{"epoch": 17, "accuracy": 0.3457, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.3558, "desc": "test_balanced"}
{"epoch": 19, "accuracy": 0.3367, "desc": "test_balanced"}
{"epoch": 20, "accuracy": 0.3389, "desc": "test_balanced"}
@@ -0,0 +1,20 @@
{"epoch": 1, "accuracy": 0.4766666666666667, "desc": "test_balanced"}
{"epoch": 2, "accuracy": 0.5066666666666667, "desc": "test_balanced"}
{"epoch": 3, "accuracy": 0.51, "desc": "test_balanced"}
{"epoch": 4, "accuracy": 0.5133333333333333, "desc": "test_balanced"}
{"epoch": 5, "accuracy": 0.5233333333333333, "desc": "test_balanced"}
{"epoch": 6, "accuracy": 0.5366666666666666, "desc": "test_balanced"}
{"epoch": 7, "accuracy": 0.5333333333333333, "desc": "test_balanced"}
{"epoch": 8, "accuracy": 0.51, "desc": "test_balanced"}
{"epoch": 9, "accuracy": 0.5233333333333333, "desc": "test_balanced"}
{"epoch": 10, "accuracy": 0.5133333333333333, "desc": "test_balanced"}
{"epoch": 11, "accuracy": 0.5533333333333333, "desc": "test_balanced"}
{"epoch": 12, "accuracy": 0.5766666666666667, "desc": "test_balanced"}
{"epoch": 13, "accuracy": 0.61, "desc": "test_balanced"}
{"epoch": 14, "accuracy": 0.6133333333333333, "desc": "test_balanced"}
{"epoch": 15, "accuracy": 0.58, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.5866666666666667, "desc": "test_balanced"}
{"epoch": 17, "accuracy": 0.6, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.6, "desc": "test_balanced"}
{"epoch": 19, "accuracy": 0.5866666666666667, "desc": "test_balanced"}
{"epoch": 20, "accuracy": 0.61, "desc": "test_balanced"}
@@ -0,0 +1,20 @@
{"epoch": 1, "accuracy": 0.3566666666666667, "desc": "test_balanced"}
{"epoch": 2, "accuracy": 0.4, "desc": "test_balanced"}
{"epoch": 3, "accuracy": 0.37333333333333335, "desc": "test_balanced"}
{"epoch": 4, "accuracy": 0.41, "desc": "test_balanced"}
{"epoch": 5, "accuracy": 0.41333333333333333, "desc": "test_balanced"}
{"epoch": 6, "accuracy": 0.42, "desc": "test_balanced"}
{"epoch": 7, "accuracy": 0.44, "desc": "test_balanced"}
{"epoch": 8, "accuracy": 0.39666666666666667, "desc": "test_balanced"}
{"epoch": 9, "accuracy": 0.4266666666666667, "desc": "test_balanced"}
{"epoch": 10, "accuracy": 0.43666666666666665, "desc": "test_balanced"}
{"epoch": 11, "accuracy": 0.48, "desc": "test_balanced"}
{"epoch": 12, "accuracy": 0.48333333333333334, "desc": "test_balanced"}
{"epoch": 13, "accuracy": 0.48333333333333334, "desc": "test_balanced"}
{"epoch": 14, "accuracy": 0.47333333333333333, "desc": "test_balanced"}
{"epoch": 15, "accuracy": 0.4766666666666667, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.46, "desc": "test_balanced"}
{"epoch": 17, "accuracy": 0.48, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.49, "desc": "test_balanced"}
{"epoch": 19, "accuracy": null, "desc": "test_balanced"}
{"epoch": 20, "accuracy": 0.5133333333333333, "desc": "test_balanced"}
@@ -0,0 +1,20 @@
{"epoch": 1, "accuracy": 0.4766666666666667, "desc": "test_balanced"}
{"epoch": 2, "accuracy": 0.5133333333333333, "desc": "test_balanced"}
{"epoch": 3, "accuracy": 0.52, "desc": "test_balanced"}
{"epoch": 4, "accuracy": 0.52, "desc": "test_balanced"}
{"epoch": 5, "accuracy": 0.54, "desc": "test_balanced"}
{"epoch": 6, "accuracy": 0.5133333333333333, "desc": "test_balanced"}
{"epoch": 7, "accuracy": 0.5533333333333333, "desc": "test_balanced"}
{"epoch": 8, "accuracy": 0.5366666666666666, "desc": "test_balanced"}
{"epoch": 9, "accuracy": 0.5266666666666666, "desc": "test_balanced"}
{"epoch": 10, "accuracy": 0.5433333333333333, "desc": "test_balanced"}
{"epoch": 11, "accuracy": 0.5333333333333333, "desc": "test_balanced"}
{"epoch": 12, "accuracy": 0.5166666666666667, "desc": "test_balanced"}
{"epoch": 13, "accuracy": 0.5066666666666667, "desc": "test_balanced"}
{"epoch": 14, "accuracy": 0.5233333333333333, "desc": "test_balanced"}
{"epoch": 15, "accuracy": 0.5166666666666667, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.49333333333333335, "desc": "test_balanced"}
{"epoch": 17, "accuracy": 0.5033333333333333, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.5066666666666667, "desc": "test_balanced"}
{"epoch": 19, "accuracy": 0.49666666666666665, "desc": "test_balanced"}
{"epoch": 20, "accuracy": 0.49333333333333335, "desc": "test_balanced"}
@@ -0,0 +1,20 @@
{"epoch": 1, "accuracy": 0.33, "desc": "test_balanced"}
{"epoch": 2, "accuracy": 0.38, "desc": "test_balanced"}
{"epoch": 3, "accuracy": 0.36, "desc": "test_balanced"}
{"epoch": 4, "accuracy": 0.4066666666666667, "desc": "test_balanced"}
{"epoch": 5, "accuracy": 0.41, "desc": "test_balanced"}
{"epoch": 6, "accuracy": 0.45666666666666667, "desc": "test_balanced"}
{"epoch": 7, "accuracy": 0.44, "desc": "test_balanced"}
{"epoch": 8, "accuracy": 0.4066666666666667, "desc": "test_balanced"}
{"epoch": 9, "accuracy": 0.4033333333333333, "desc": "test_balanced"}
{"epoch": 10, "accuracy": 0.4066666666666667, "desc": "test_balanced"}
{"epoch": 11, "accuracy": 0.4066666666666667, "desc": "test_balanced"}
{"epoch": 12, "accuracy": 0.4166666666666667, "desc": "test_balanced"}
{"epoch": 13, "accuracy": 0.39666666666666667, "desc": "test_balanced"}
{"epoch": 14, "accuracy": 0.4166666666666667, "desc": "test_balanced"}
{"epoch": 15, "accuracy": 0.4166666666666667, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.41, "desc": "test_balanced"}
{"epoch": 17, "accuracy": 0.4033333333333333, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.3933333333333333, "desc": "test_balanced"}
{"epoch": 19, "accuracy": 0.4, "desc": "test_balanced"}
{"epoch": 20, "accuracy": 0.41, "desc": "test_balanced"}
@@ -0,0 +1,20 @@
{"epoch": 1, "accuracy": 0.45666666666666667, "desc": "test_balanced"}
{"epoch": 2, "accuracy": 0.4666666666666667, "desc": "test_balanced"}
{"epoch": 3, "accuracy": 0.4666666666666667, "desc": "test_balanced"}
{"epoch": 4, "accuracy": 0.45666666666666667, "desc": "test_balanced"}
{"epoch": 5, "accuracy": 0.46, "desc": "test_balanced"}
{"epoch": 6, "accuracy": 0.49, "desc": "test_balanced"}
{"epoch": 7, "accuracy": 0.5066666666666667, "desc": "test_balanced"}
{"epoch": 8, "accuracy": 0.5166666666666667, "desc": "test_balanced"}
{"epoch": 9, "accuracy": 0.5, "desc": "test_balanced"}
{"epoch": 10, "accuracy": 0.5133333333333333, "desc": "test_balanced"}
{"epoch": 11, "accuracy": 0.5133333333333333, "desc": "test_balanced"}
{"epoch": 12, "accuracy": 0.5233333333333333, "desc": "test_balanced"}
{"epoch": 13, "accuracy": 0.46, "desc": "test_balanced"}
{"epoch": 14, "accuracy": 0.4766666666666667, "desc": "test_balanced"}
{"epoch": 15, "accuracy": 0.47, "desc": "test_balanced"}
{"epoch": 16, "accuracy": 0.48333333333333334, "desc": "test_balanced"}
{"epoch": 17, "accuracy": 0.47, "desc": "test_balanced"}
{"epoch": 18, "accuracy": 0.47333333333333333, "desc": "test_balanced"}
{"epoch": 19, "accuracy": 0.48, "desc": "test_balanced"}
{"epoch": 20, "accuracy": 0.45, "desc": "test_balanced"}
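
Several of the curves above peak mid-training and then decline, so the last epoch is not necessarily the best checkpoint. The sketch below assumes `(epoch, accuracy)` pairs already loaded from a metrics file, and reports the peak epoch and how far the final epoch falls below it. The function name is hypothetical, and the values are a subset copied from the last hunk above.

```python
def peak_vs_final(curve):
    """curve: list of (epoch, accuracy) pairs sorted by epoch, with no missing values."""
    peak_epoch, peak_acc = max(curve, key=lambda pair: pair[1])
    _, final_acc = curve[-1]
    return {
        "peak_epoch": peak_epoch,
        "peak": peak_acc,
        "final": final_acc,
        "drop": peak_acc - final_acc,
    }


# Epochs 10-13 and 20, copied from the last hunk above.
curve = [
    (10, 0.5133333333333333),
    (11, 0.5133333333333333),
    (12, 0.5233333333333333),
    (13, 0.46),
    (20, 0.45),
]
summary = peak_vs_final(curve)
print(summary["peak_epoch"], round(summary["drop"], 3))  # 12 0.073
```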