This repository was archived by the owner on Jul 7, 2023. It is now read-only.

Commit 406db60

ghegorsepassi authored and committed
Added links to files in doc and corrected a few typos (#282)
* better documentation with links
* fixed line permalink
1 parent 24071ba commit 406db60

File tree

2 files changed: +7 −8 lines

README.md

Lines changed: 1 addition & 2 deletions
```diff
@@ -214,8 +214,7 @@ on the task (e.g. fed through a final linear transform to produce logits for a
 softmax over classes). All models are imported in
 [`models.py`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/models/models.py),
 inherit from `T2TModel` - defined in
-[`t2t_model.py`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/utils/t2t_model.py)
-- and are registered with
+[`t2t_model.py`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/utils/t2t_model.py) - and are registered with
 [`@registry.register_model`](https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/utils/registry.py).
 
 ### Hyperparameter Sets
```

docs/new_problem.md

Lines changed: 6 additions & 6 deletions
````diff
@@ -15,9 +15,9 @@ Let's add a new dataset together and train the transformer model. We'll be learn
 
 For each problem we want to tackle we create a new problem class and register it. Let's call our problem `Word2def`.
 
-Since many text2text problems share similar methods, there's already a class called `Text2TextProblem` that extends the base problem class, `Problem` (both found in `problem.py`).
+Since many text2text problems share similar methods, there's already a class called [`Text2TextProblem`](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/problem.py#L354) that extends the base problem class, `Problem` (both found in [`problem.py`](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/problem.py)).
 
-For our problem, we can go ahead and create the file `word2def.py` in the `data_generators` folder and add our new problem, `Word2def`, which extends `Text2TextProblem`. Let's also register it while we're at it so we can specify the problem through flags.
+For our problem, we can go ahead and create the file `word2def.py` in the [`data_generators`](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/) folder and add our new problem, `Word2def`, which extends [`Text2TextProblem`](https://github.com/tensorflow/tensor2tensor/blob/24071ba07d5a14c170044c5e60a24bda8179fb7a/tensor2tensor/data_generators/problem.py#L354). Let's also register it while we're at it so we can specify the problem through flags.
 
 ```python
 @registry.register_problem
@@ -28,7 +28,7 @@ class Word2def(problem.Text2TextProblem):
   ...
 ```
 
-We need to implement the following methods from `Text2TextProblem` in our new class:
+We need to implement the following methods from [`Text2TextProblem`](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/problem.py#L354) in our new class:
 * is_character_level
 * targeted_vocab_size
 * generator
@@ -42,7 +42,7 @@ Let's tackle them one by one:
 
 **input_space_id, target_space_id, is_character_level, targeted_vocab_size, use_subword_tokenizer**:
 
-SpaceIDs tell Tensor2Tensor what sort of space the input and target tensors are in. These are things like, EN_CHR (English character), EN_TOK (English token), AUDIO_WAV (audio waveform), IMAGE, DNA (genetic bases). The complete list can be found at `data_generators/problem.py` in the class `SpaceID`.
+SpaceIDs tell Tensor2Tensor what sort of space the input and target tensors are in. These are things like, EN_CHR (English character), EN_TOK (English token), AUDIO_WAV (audio waveform), IMAGE, DNA (genetic bases). The complete list can be found at [`data_generators/problem.py`](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/problem.py) in the class `SpaceID`.
 
 Since we're generating definitions and feeding in words at the character level, we set `is_character_level` to true, and use the same SpaceID, EN_CHR, for both input and target. Additionally, since we aren't using tokens, we don't need to give a `targeted_vocab_size` or define `use_subword_tokenizer`.
 
@@ -86,7 +86,7 @@ class Word2def(problem.Text2TextProblem):
 
 **generator**:
 
-We're almost done. `generator` generates the training and evaluation data and stores them in files like "word2def_train.lang1" in your DATA_DIR. Thankfully several commonly used methods like `character_generator`, and `token_generator` are already written in the file `wmt.py`. We will import `character_generator` and write:
+We're almost done. `generator` generates the training and evaluation data and stores them in files like "word2def_train.lang1" in your DATA_DIR. Thankfully several commonly used methods like `character_generator`, and `token_generator` are already written in the file [`wmt.py`](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/wmt.py). We will import `character_generator` and [`text_encoder`](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/text_encoder.py) to write:
 ```python
 def generator(self, data_dir, tmp_dir, train):
   character_vocab = text_encoder.ByteTextEncoder()
@@ -151,7 +151,7 @@ _WORD2DEF_TEST_DATASETS = [
 
 ## Putting it all together
 
-Now our `word2def.py` file looks like: (with the correct imports)
+Now our `word2def.py` file looks like:
 ```python
 """ Problem definition for word to dictionary definition.
 """
````
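Putting the pieces of the walkthrough together, the shape of the resulting problem class can be sketched as follows. The base class and the `EN_CHR` constant here are simplified stand-ins so the sketch runs on its own; the real `Text2TextProblem`, `SpaceID`, `character_generator`, and `ByteTextEncoder` live in tensor2tensor's `data_generators` package, and the dummy example yielded by `generator` is illustrative only.

```python
# Hedged sketch of the Word2def problem assembled in the walkthrough.
# Text2TextProblem and EN_CHR are simplified stand-ins; the real ones come
# from tensor2tensor.data_generators (problem.py, wmt.py, text_encoder.py).

class Text2TextProblem:
    """Stand-in for tensor2tensor's Text2TextProblem base class."""


EN_CHR = 2  # stand-in for problem.SpaceID.EN_CHR


class Word2def(Text2TextProblem):
    @property
    def is_character_level(self):
        # Words and definitions are fed character by character, so no
        # targeted_vocab_size or use_subword_tokenizer is needed.
        return True

    @property
    def input_space_id(self):
        return EN_CHR

    @property
    def target_space_id(self):
        return EN_CHR

    def generator(self, data_dir, tmp_dir, train):
        # The real implementation builds a ByteTextEncoder and delegates to
        # wmt.character_generator over the train/dev datasets; yield one
        # dummy byte-encoded example here so the sketch is runnable.
        yield {"inputs": [119, 111, 114, 100], "targets": [100, 101, 102]}


p = Word2def()
print(p.is_character_level, p.input_space_id == p.target_space_id)  # True True
```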
