This repository was archived by the owner on Jul 7, 2023. It is now read-only.

Commit 08f0742

Lukasz Kaiser authored and Ryan Sepassi committed

Add a quick MNIST model for README, comments and documentation, bump back version.
PiperOrigin-RevId: 185311943
1 parent 70a0464 commit 08f0742

File tree: 7 files changed (+81, −12 lines)

README.md

Lines changed: 17 additions & 3 deletions

@@ -19,7 +19,8 @@ or running existing ones on your data. It is actively used and maintained by
 researchers and engineers within
 the [Google Brain team](https://research.google.com/teams/brain/) and was used
 to develop state-of-the-art models for translation (see
-[Attention Is All You Need](https://arxiv.org/abs/1706.03762)), summarization,
+[Attention Is All You Need](https://arxiv.org/abs/1706.03762)),
+[summarization](https://arxiv.org/abs/1801.10198),
 image generation and other tasks. You can read
 more about T2T in the [Google Research Blog post introducing
 it](https://research.googleblog.com/2017/06/accelerating-deep-learning-research.html).
@@ -42,7 +43,20 @@ with T2T announcements.
 browser using a free VM from Google, no installation needed.
 
 Alternatively, here is a one-command version that installs T2T, downloads data,
-trains an English-German translation model, and evaluates it:
+trains an MNIST model and evaluates it:
+
+```
+pip install tensor2tensor && t2t-trainer \
+  --generate_data \
+  --data_dir=~/t2t_data \
+  --problems=image_mnist \
+  --model=shake_shake \
+  --hparams_set=shake_shake_quick \
+  --output_dir=~/t2t_train/mnist1
+```
+
+For a more demanding problem, here is how to train
+an English-German translation model and evaluate it:
 
 ```
 pip install tensor2tensor && t2t-trainer \
@@ -54,7 +68,7 @@ pip install tensor2tensor && t2t-trainer \
   --output_dir=~/t2t_train/base
 ```
 
-You can decode from the model interactively:
+You can decode from the model interactively to get translations:
 
 ```
 t2t-decoder \

docs/walkthrough.md

Lines changed: 17 additions & 3 deletions

(Identical diff to README.md: docs/walkthrough.md mirrors the README content.)

setup.py

Lines changed: 1 addition & 1 deletion

@@ -5,7 +5,7 @@
 
 setup(
     name='tensor2tensor',
-    version='2.0.0',
+    version='1.5.0',
     description='Tensor2Tensor',
     author='Google Inc.',
     author_email='[email protected]',

tensor2tensor/bin/t2t-datagen

Lines changed: 12 additions & 1 deletion

@@ -1,5 +1,16 @@
 #!/usr/bin/env python
-"""t2t-datagen."""
+"""Data generation for Tensor2Tensor.
+
+This script is used to generate data to train your models
+for a number of problems for which open-source data is available.
+
+For example, to generate data for MNIST run this:
+
+t2t-datagen \
+  --problem=image_mnist \
+  --data_dir=~/t2t_data \
+  --tmp_dir=~/t2t_data/tmp
+"""
 from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function

tensor2tensor/bin/t2t-trainer

Lines changed: 17 additions & 1 deletion

@@ -1,5 +1,21 @@
 #!/usr/bin/env python
-"""t2t-trainer."""
+"""Trainer for Tensor2Tensor.
+
+This script is used to train your models in Tensor2Tensor.
+
+For example, to train a shake-shake model on MNIST run this:
+
+t2t-trainer \
+  --generate_data \
+  --problems=image_mnist \
+  --data_dir=~/t2t_data \
+  --tmp_dir=~/t2t_data/tmp \
+  --model=shake_shake \
+  --hparams_set=shake_shake_quick \
+  --output_dir=~/t2t_train/mnist1 \
+  --train_steps=1000 \
+  --eval_steps=100
+"""
 from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function

tensor2tensor/models/shake_shake.py

Lines changed: 10 additions & 0 deletions

@@ -185,6 +185,16 @@ def shakeshake_small():
   return hparams
 
 
+@registry.register_hparams
+def shake_shake_quick():
+  hparams = shakeshake_small()
+  hparams.optimizer = "Adam"
+  hparams.learning_rate_cosine_cycle_steps = 1000
+  hparams.learning_rate = 0.5
+  hparams.batch_size = 100
+  return hparams
+
+
 @registry.register_hparams
 def shakeshake_big():
   hparams = shakeshake_small()

tensor2tensor/models/transformer.py

Lines changed: 7 additions & 3 deletions

@@ -13,10 +13,14 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-"""transformer (attention).
+"""Transformer model from "Attention Is All You Need".
 
-encoder: [Self-Attention, Feed-forward] x n
-decoder: [Self-Attention, Source-Target-Attention, Feed-forward] x n
+The Transformer model consists of an encoder and a decoder. Both are stacks
+of self-attention layers followed by feed-forward layers. This model yields
+good results on a number of problems, especially in NLP and machine translation.
+
+See "Attention Is All You Need" (https://arxiv.org/abs/1706.03762) for the full
+description of the model and the results obtained with its early version.
 """
 
 from __future__ import absolute_import
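The rewritten docstring describes the encoder as a stack of [self-attention, feed-forward] layers. A toy NumPy sketch of one such encoder layer, using scaled dot-product attention as in the paper; weights are random and layer normalization is omitted, so this shows structure only, not the actual tensor2tensor implementation:

```python
# Toy sketch of one encoder layer from the docstring above:
# self-attention followed by a feed-forward block, each with a residual add.
import numpy as np


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, wq, wk, wv):
    # Scaled dot-product attention over a single sequence (no heads, no mask).
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v


def feed_forward(x, w1, w2):
    # Two linear maps with a ReLU in between, applied position-wise.
    return np.maximum(0.0, x @ w1) @ w2


rng = np.random.default_rng(0)
seq_len, d = 5, 8
x = rng.standard_normal((seq_len, d))
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
w1, w2 = rng.standard_normal((d, 4 * d)), rng.standard_normal((4 * d, d))

# One encoder layer: attention, then feed-forward, with residual connections.
y = x + self_attention(x, wq, wk, wv)
y = y + feed_forward(y, w1, w2)
print(y.shape)  # (5, 8)
```

A decoder layer adds a source-target attention block between the two, attending over the encoder output instead of the decoder's own states.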
