Hi there, in Chapter 10 (Creating Text Embedding Models) from Part III, in the section Fine-Tuning an Embedding Model on page 313, I think there is a typo in this sentence:
After training our cross-encoder, we use the remaining 400,000 sentence pairs (from
our original dataset of 50,000 sentence pairs) as our silver dataset (step 2):
After taking a subset of 10,000 sentence pairs, there would be 40,000 pairs remaining from the original dataset of 50,000 sentence pairs, so I believe "400,000" should read "40,000".
This sentence appears right after the following code sample:
# Train a cross-encoder on the gold dataset
cross_encoder = CrossEncoder("bert-base-uncased", num_labels=2)
cross_encoder.fit(
    train_dataloader=gold_dataloader,
    epochs=1,
    show_progress_bar=True,
    warmup_steps=100,
    use_amp=False
)
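For clarity, the split the chapter describes can be sketched as follows (variable names here are illustrative, not from the book; the list is a stand-in for the actual sentence pairs):

```python
# Stand-in for the original dataset of 50,000 sentence pairs
sentence_pairs = list(range(50_000))

# Gold subset of 10,000 pairs used to train the cross-encoder (step 1)
gold = sentence_pairs[:10_000]

# Remaining pairs become the silver dataset to be pseudo-labeled (step 2)
silver = sentence_pairs[10_000:]

print(len(gold))    # 10000
print(len(silver))  # 40000, not 400,000
```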
Thanks!