Releases
v3.3.7
docs: remove useless comments
fix(Transformer): too many things to tell
feat: even more precise floating point for metrics and loss
refactor: special tokens now passed via init for Transformer
feat: enhance beam search and token prediction mechanisms
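For reference, a minimal sketch of the beam search idea this commit touches; `step_log_probs` is an assumed stand-in interface for the model's next-token log-probabilities, not this repo's actual API:

```python
import numpy as np

def beam_search(step_log_probs, beam_width=3, max_len=10, eos_id=2):
    """Minimal beam search sketch (illustrative, not the library's code).

    step_log_probs(seq) -> 1-D array of vocab log-probabilities
    for the next token given the partial sequence `seq`.
    """
    beams = [([], 0.0)]  # (token sequence, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos_id:  # finished beams carry over unchanged
                candidates.append((seq, score))
                continue
            log_probs = step_log_probs(seq)
            # keep only the top-k expansions of this beam
            for tok in np.argsort(log_probs)[-beam_width:]:
                candidates.append((seq + [int(tok)], score + float(log_probs[tok])))
        # prune globally to the best `beam_width` hypotheses
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]
```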
docs: update readme
fix(Transformer): vanishing gradient fix
fix(Transformer): still on it (wip)
fix(Transformer): another fix
fix(Transformer): special token indices
fix(Transformer): normalization IS the issue
docs: update readme
fix(Transformer): cross attention weights
fix: LearningRateScheduler
fix: LearningRateScheduler
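These commits don't say which schedule is used; a common choice for Transformer training is the warmup schedule from "Attention Is All You Need", sketched below for reference (an assumption, not necessarily this repo's scheduler):

```python
def noam_lr(step, d_model=512, warmup=4000):
    # Warmup schedule from "Attention Is All You Need":
    # linear warmup for `warmup` steps, then inverse-sqrt decay.
    step = max(step, 1)  # avoid division by zero at step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)
```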
fix: normalization in data preparation
fix: different vocab size for different tokenizations
fix(PositionalEncoding): scaling
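The usual scaling concern here is that raw embeddings get drowned out by the unit-amplitude sinusoids; a hedged sketch of the standard remedy (multiply embeddings by sqrt(d_model) before adding the encoding; function names are illustrative and an even d_model is assumed):

```python
import numpy as np

def sinusoidal_encoding(max_len, d_model):
    # Standard table: PE[pos, 2i] = sin(pos / 10000^(2i/d)),
    # PE[pos, 2i+1] = cos(pos / 10000^(2i/d)). Assumes even d_model.
    pos = np.arange(max_len)[:, None]
    i = np.arange(0, d_model, 2)[None, :]
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

def add_positional_encoding(x, pe):
    # Scale embeddings by sqrt(d_model) so they are not drowned out
    # by the unit-amplitude sinusoids (the usual "scaling" fix).
    d_model = x.shape[-1]
    return x * np.sqrt(d_model) + pe[: x.shape[1]]
```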
fix(AddNorm): better normalization
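For context, AddNorm is conventionally a residual connection followed by layer normalization over the feature axis; a minimal sketch with illustrative parameter names:

```python
import numpy as np

def add_norm(x, sublayer_out, gamma, beta, eps=1e-6):
    # Residual connection then layer normalization:
    # LayerNorm(x + Sublayer(x)), normalized over the last axis.
    y = x + sublayer_out
    mean = y.mean(axis=-1, keepdims=True)
    var = y.var(axis=-1, keepdims=True)
    return gamma * (y - mean) / np.sqrt(var + eps) + beta
```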
fix(TransformerEncoderLayer): huge improvements
perf(SequenceCrossEntropy): add vectorization
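Vectorization here presumably means replacing a per-token Python loop with a single gather; a sketch assuming `y_pred` holds per-token probabilities of shape (batch, seq, vocab), which is not necessarily this repo's exact signature:

```python
import numpy as np

def sequence_cross_entropy(y_pred, y_true, pad_id=0, eps=1e-12):
    """Vectorized sequence cross-entropy (illustrative sketch).

    y_pred: (batch, seq, vocab) probabilities; y_true: (batch, seq) token ids.
    Padding positions are masked out of the average.
    """
    b, s, _ = y_pred.shape
    # One fancy-indexing gather instead of a loop over every (batch, step) pair.
    picked = y_pred[np.arange(b)[:, None], np.arange(s)[None, :], y_true]
    mask = (y_true != pad_id).astype(y_pred.dtype)
    nll = -np.log(picked + eps) * mask
    return nll.sum() / np.maximum(mask.sum(), 1.0)
```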
fix(Tokenizer+Transformer): tokenization alignment for special tokens
fix(Transformer): investigate and address gradient instability and explosion
fix(sce): label smoothing
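For reference, standard label smoothing replaces the one-hot target with (1 - ε) on the true class and ε / (V - 1) spread over the remaining classes; a minimal sketch:

```python
import numpy as np

def smooth_labels(y_true, vocab_size, smoothing=0.1):
    # (1 - eps) on the true class, eps / (vocab_size - 1) everywhere else.
    off = smoothing / (vocab_size - 1)
    smoothed = np.full(y_true.shape + (vocab_size,), off)
    np.put_along_axis(smoothed, y_true[..., None], 1.0 - smoothing, axis=-1)
    return smoothed
```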
refactor: gradient clipping
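A common formulation is clipping by global L2 norm, so every gradient is scaled by the same factor and the overall direction is preserved; a sketch, not necessarily how this repo clips:

```python
import numpy as np

def clip_by_global_norm(grads, max_norm=1.0):
    # Rescale all gradients jointly when their combined L2 norm
    # exceeds max_norm, preserving the gradient direction.
    total = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    if total > max_norm:
        scale = max_norm / (total + 1e-12)
        grads = [g * scale for g in grads]
    return grads
```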
fix(Transformer): gradient explosion
fix(Transformer): tokens padding and max sequence
test: tried with a better dataset
fix(sce): y_pred treated as logits instead of probs
fix(TransformerEncoderLayer): remove arbitrary scaling
fix(Transformer): sce won't ignore sos and eos tokens
fix: sce extending LossFunction
fix(sce): softmax not necessary
feat: add BLEU, ROUGE-L and ROUGE-N scores
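For reference, ROUGE-L scores the longest common subsequence between candidate and reference tokens; a minimal sketch of that metric (BLEU and ROUGE-N are built from n-gram counts instead):

```python
def rouge_l(candidate, reference):
    # ROUGE-L F1 from the longest common subsequence of the two
    # token lists (classic O(n*m) dynamic program).
    n, m = len(candidate), len(reference)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if candidate[i] == reference[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    lcs = dp[n][m]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / n, lcs / m
    return 2 * precision * recall / (precision + recall)
```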
fix: validation data in fit method and shuffle in train_test_split
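Shuffling before splitting avoids bias from the dataset's original ordering; a sketch of the idea (the signature is illustrative, not this repo's exact API):

```python
import numpy as np

def train_test_split(X, y, test_size=0.2, seed=None):
    # Permute indices before splitting so the train/test draw is not
    # biased by how the dataset happens to be ordered.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(len(X) * (1 - test_size))
    train, test = idx[:cut], idx[cut:]
    return X[train], X[test], y[train], y[test]
```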
docs: modify example to use validation split and BLEU score
fix(PositionalEncoding): better positional scaling
ci: bump version to 3.3.7