initial commit: Legal BERT + SVM model for legal case outcome classification #655
Related Issue
Info about the related issue
CodePeak 2025 Participant
Contributor
Closes: #issue number that will be closed through this PR
Describe the add-ons or changes you've made
Give a clear description of what you have added or modified
Added an SVM-based classification model that uses Legal-BERT embeddings from case texts to predict legal case outcomes.
The model groups outcomes into four semantic categories (positive, neutral, negative, and approval) to keep the classes balanced.
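The pipeline described above could look roughly like the sketch below. It is an illustration under assumptions, not the code in this PR: the checkpoint name `nlpaueb/legal-bert-base-uncased`, the mean pooling over tokens, the `StandardScaler` step, and the helper names `embed_texts` and `train_svm` are all hypothetical choices that the description does not confirm.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumed checkpoint; the PR does not say which Legal-BERT variant was used.
MODEL_NAME = "nlpaueb/legal-bert-base-uncased"


def embed_texts(texts, model_name=MODEL_NAME, max_length=512):
    """Return one mean-pooled embedding per text (pooling strategy is an assumption)."""
    # Imported lazily so the SVM part can run without torch/transformers installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    vectors = []
    with torch.no_grad():
        for text in texts:
            batch = tokenizer(text, truncation=True, max_length=max_length,
                              return_tensors="pt")
            hidden = model(**batch).last_hidden_state      # (1, tokens, dim)
            mask = batch["attention_mask"].unsqueeze(-1)   # (1, tokens, 1)
            # Average the token vectors, ignoring padding positions.
            pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
            vectors.append(pooled.squeeze(0).numpy())
    return np.vstack(vectors)


def train_svm(embeddings, labels):
    """Fit a scaled SVM on precomputed embeddings (hypothetical helper)."""
    clf = make_pipeline(StandardScaler(),
                        SVC(kernel="rbf", class_weight="balanced"))
    clf.fit(embeddings, labels)
    return clf
```

With this split, `train_svm(embed_texts(case_texts), outcome_labels)` would train the classifier, and the embedding step can be cached or swapped without touching the SVM code.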
Implementation details
Results
Achieved Macro F1 = 0.42 on test data
Improved generalization compared to the baseline TF-IDF + SVM model
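For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so each of the four categories counts equally regardless of its size. The sketch below computes it in plain Python; the function name `macro_f1` and the toy labels are illustrative only and are not taken from this PR's data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted instances contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example (made-up labels, not results from this PR).
y_true = ["positive", "positive", "neutral", "negative", "approval", "approval"]
y_pred = ["positive", "neutral", "neutral", "negative", "approval", "positive"]
print(round(macro_f1(y_true, y_pred), 3))  # 0.708
```

In the toy example the per-class F1 scores are 0.5 (positive), 2/3 (neutral), 1.0 (negative), and 2/3 (approval), which average to 17/24.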
Type of change
What sort of change have you made:
How Has This Been Tested?
Describe how it has been tested
Describe how you have verified the changes made
The model was tested using train, validation, and test splits along with cross-validation during hyperparameter tuning. Unit tests verified model loading, text preprocessing, and prediction outputs. The changes were validated by comparing results against the previous TF-IDF baseline, showing improved macro F1-scores and consistent predictions across sample legal case texts.
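The evaluation procedure described above could be sketched as follows. This is a minimal illustration under assumptions: the 60/20/20 split, the parameter grid, the 3-fold cross-validation, and the function name `tune_and_evaluate` are hypothetical and are not confirmed by the PR. `X` is assumed to be a matrix of precomputed embeddings.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC


def tune_and_evaluate(X, y, seed=0):
    """Tune an SVM with cross-validation, then score it on held-out splits."""
    X, y = np.asarray(X), np.asarray(y)

    # Assumed 60/20/20 stratified train/validation/test split.
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, test_size=0.4, stratify=y, random_state=seed)
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=seed)

    # Cross-validated search on the training split only (grid is an assumption).
    grid = GridSearchCV(
        SVC(class_weight="balanced"),
        param_grid={"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]},
        scoring="f1_macro",
        cv=3,
    )
    grid.fit(X_train, y_train)

    return {
        "best_params": grid.best_params_,
        "val_macro_f1": f1_score(y_val, grid.predict(X_val), average="macro"),
        "test_macro_f1": f1_score(y_test, grid.predict(X_test), average="macro"),
    }
```

Keeping the test split out of the grid search means the reported test score is not influenced by hyperparameter selection.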
Checklist: