This repository was archived by the owner on Jul 7, 2023. It is now read-only.

Commit b9b4dad

Author: Ryan Sepassi
Reduce num file shards for example PoetryLines problem
PiperOrigin-RevId: 187055251
1 parent b1e7708 commit b9b4dad

2 files changed: 7 additions and 7 deletions

docs/new_problem.md

Lines changed: 5 additions & 5 deletions
@@ -65,10 +65,10 @@ class PoetryLines(text_problems.Text2TextProblem):
     # 10% evaluation data
     return [{
         "split": problem.DatasetSplit.TRAIN,
-        "shards": 90,
+        "shards": 9,
     }, {
         "split": problem.DatasetSplit.EVAL,
-        "shards": 10,
+        "shards": 1,
     }]
 
   def generate_samples(self, data_dir, tmp_dir, dataset_split):
@@ -133,7 +133,7 @@ pre-existing "training" and "evaluation" sets. If we did, we'd set
 split.
 
 The `dataset_splits` method determines the fraction that goes to each split. The
-training data will be generated into 90 files and the evaluation data into 10.
+training data will be generated into 9 files and the evaluation data into 1.
 90% of the data will be for training. 10% of the data will be for evaluation.
 
 ```python
@@ -148,10 +148,10 @@ training data will be generated into 90 files and the evaluation data into 10.
     # 10% evaluation data
     return [{
         "split": problem.DatasetSplit.TRAIN,
-        "shards": 90,
+        "shards": 9,
     }, {
         "split": problem.DatasetSplit.EVAL,
-        "shards": 10,
+        "shards": 1,
     }]
 ```
 
tensor2tensor/test_data/example_usr_dir/my_submodule.py

Lines changed: 2 additions & 2 deletions
@@ -56,10 +56,10 @@ def dataset_splits(self):
     # 10% evaluation data
     return [{
         "split": problem.DatasetSplit.TRAIN,
-        "shards": 90,
+        "shards": 9,
     }, {
         "split": problem.DatasetSplit.EVAL,
-        "shards": 10,
+        "shards": 1,
     }]
 
   def generate_samples(self, data_dir, tmp_dir, dataset_split):
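
Note: the change keeps the 9:1 ratio between training and evaluation shards, which is consistent with the documentation still describing a 90%/10% split. The sketch below illustrates that arithmetic only. It assumes each split's share of the data is proportional to its shard count; `split_fractions` is a hypothetical helper, not part of tensor2tensor, and the plain strings `"train"` and `"eval"` stand in for `problem.DatasetSplit.TRAIN` and `problem.DatasetSplit.EVAL`.

```python
def split_fractions(splits):
    """Return each split's share of the data.

    Assumption: the share is proportional to the split's shard count.
    """
    total_shards = sum(entry["shards"] for entry in splits)
    return {entry["split"]: entry["shards"] / total_shards for entry in splits}


# Shard counts before and after this commit (strings are stand-ins for
# problem.DatasetSplit.TRAIN and problem.DatasetSplit.EVAL).
old_splits = [{"split": "train", "shards": 90}, {"split": "eval", "shards": 10}]
new_splits = [{"split": "train", "shards": 9}, {"split": "eval", "shards": 1}]

# Under the proportionality assumption, the fractions are unchanged.
assert split_fractions(old_splits) == split_fractions(new_splits)
print(split_fractions(new_splits))
```

Under that assumption only the number of output files changes (100 before, 10 after), not the share of data that lands in each split.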
