|
1 | | -=============================== |
2 | | -SageMaker TensorFlow Containers |
3 | | -=============================== |
| 1 | +===================================== |
| 2 | +SageMaker TensorFlow Training Toolkit |
| 3 | +===================================== |
4 | 4 |
|
5 | | -SageMaker TensorFlow Containers is an open source library for making the |
6 | | -TensorFlow framework run on `Amazon SageMaker <https://aws.amazon.com/documentation/sagemaker/>`__. |
| 5 | +SageMaker TensorFlow Training Toolkit is an open-source library for using TensorFlow to train models on Amazon SageMaker. |
7 | 6 |
|
8 | | -This repository also contains Dockerfiles which install this library, TensorFlow, and dependencies |
9 | | -for building SageMaker TensorFlow images. |
| 7 | +For inference, see `SageMaker TensorFlow Inference Toolkit <https://github.com/aws/sagemaker-tensorflow-serving-container>`__. |
10 | 8 |
|
11 | | -For information on running TensorFlow jobs on SageMaker: `Python |
12 | | -SDK <https://github.com/aws/sagemaker-python-sdk>`__. |
| 9 | +For the Dockerfiles used for building SageMaker TensorFlow Containers, see `AWS Deep Learning Containers <https://github.com/aws/deep-learning-containers>`__. |
| 10 | + |
| 11 | +For information on running TensorFlow jobs on Amazon SageMaker, please refer to the `SageMaker Python SDK documentation <https://github.com/aws/sagemaker-python-sdk>`__. |
13 | 12 |
|
14 | 13 | For notebook examples: `SageMaker Notebook |
15 | 14 | Examples <https://github.com/awslabs/amazon-sagemaker-examples>`__. |
16 | 15 |
|
17 | | -Table of Contents |
18 | | ------------------ |
19 | | - |
20 | | -#. `Getting Started <#getting-started>`__ |
21 | | -#. `Building your Image <#building-your-image>`__ |
22 | | -#. `Running the tests <#running-the-tests>`__ |
23 | | - |
24 | | -Getting Started |
25 | | ---------------- |
26 | | - |
27 | | -Prerequisites |
28 | | -~~~~~~~~~~~~~ |
29 | | - |
30 | | -Make sure you have installed all of the following prerequisites on your |
31 | | -development machine: |
32 | | - |
33 | | -- `Docker <https://www.docker.com/>`__ |
34 | | - |
35 | | -For Testing on GPU |
36 | | -^^^^^^^^^^^^^^^^^^ |
37 | | - |
38 | | -- `Nvidia-Docker <https://github.com/NVIDIA/nvidia-docker>`__ |
39 | | - |
40 | | -Recommended |
41 | | -^^^^^^^^^^^ |
42 | | - |
43 | | -- A Python environment management tool. (e.g. |
44 | | - `PyEnv <https://github.com/pyenv/pyenv>`__, |
45 | | - `VirtualEnv <https://virtualenv.pypa.io/en/stable/>`__) |
46 | | - |
47 | | -Building your Image |
48 | | -------------------- |
49 | | - |
50 | | -`Amazon SageMaker <https://aws.amazon.com/documentation/sagemaker/>`__ |
51 | | -utilizes Docker containers to run all training jobs & inference endpoints. |
52 | | - |
53 | | -The Docker images are built from the Dockerfiles specified in |
54 | | -`Docker/ <https://github.com/aws/sagemaker-tensorflow-containers/tree/master/docker>`__. |
55 | | - |
56 | | -The Docker files are grouped based on TensorFlow version and separated |
57 | | -based on Python version and processor type. |
58 | | - |
59 | | -The Docker files for TensorFlow 2.0 are available in the |
60 | | -`tf-2 <https://github.com/aws/sagemaker-tensorflow-container/tree/tf-2>`__ branch, in |
61 | | -`docker/2.0.0/ <https://github.com/aws/sagemaker-tensorflow-container/tree/tf-2/docker/2.0.0>`__. |
62 | | - |
63 | | -The Docker images, used to run training & inference jobs, are built from |
64 | | -both corresponding "base" and "final" Dockerfiles. |
65 | | - |
66 | | -Base Images |
67 | | -~~~~~~~~~~~ |
68 | | - |
69 | | -The "base" Dockerfile encompass the installation of the framework and all of the dependencies |
70 | | -needed. It is needed before building image for TensorFlow 1.8.0 and before. |
71 | | -Building a base image is not required for images for TensorFlow 1.9.0 and onwards. |
72 | | - |
73 | | -Tagging scheme is based on <tensorflow_version>-<processor>-<python_version>. (e.g. 1.4 |
74 | | -.1-cpu-py2) |
75 | | - |
76 | | -All "final" Dockerfiles build images using base images that use the tagging scheme |
77 | | -above. |
78 | | - |
79 | | -Before building these images, you need to have a pip-installable binary of this repository saved locally. To create the SageMaker Tensorflow Container Python package: |
80 | | - |
81 | | -:: |
82 | | - # Create the binary |
83 | | - git clone https://github.com/aws/sagemaker-tensorflow-container.git |
84 | | - cd sagemaker-tensorflow-container |
85 | | - python setup.py sdist |
86 | | - cp dist/sagemaker_tensorflow_training*.tar.gz docker/<tensorflow_version>/sagemaker_tensorflow_training.tar.gz |
87 | | - |
88 | | -Once you have copied the tensorflow_training.tar.gz to the desired location [same directory as the Dockerfile], you can then build the image. |
89 | | - |
90 | | -If you want to build your "base" Docker image, then use: |
91 | | - |
92 | | -:: |
93 | | - |
94 | | - # All build instructions assume you're building from the same directory as the Dockerfile. |
95 | | - |
96 | | - # CPU |
97 | | - docker build -t tensorflow-base:<tensorflow_version>-cpu-<python_version> -f Dockerfile.cpu . |
98 | | - |
99 | | - # GPU |
100 | | - docker build -t tensorflow-base:<tensorflow_version>-gpu-<python_version> -f Dockerfile.gpu . |
101 | | - |
102 | | -:: |
103 | | - |
104 | | - # Example |
105 | | - |
106 | | - # CPU |
107 | | - docker build -t tensorflow-base:1.4.1-cpu-py2 -f Dockerfile.cpu . |
108 | | - |
109 | | - # GPU |
110 | | - docker build -t tensorflow-base:1.4.1-gpu-py2 -f Dockerfile.gpu . |
111 | | - |
112 | | -Final Images |
113 | | -~~~~~~~~~~~~ |
114 | | - |
115 | | -The "final" Dockerfiles encompass the installation of the SageMaker specific support code. |
116 | | - |
117 | | -For images of TensorFlow 1.8.0 and before, all "final" Dockerfiles use `base images for building <https://github |
118 | | -.com/aws/sagemaker-tensorflow-containers/blob/master/docker/1.4.1/final/py2/Dockerfile.cpu#L2>`__. |
119 | | - |
120 | | -These "base" images are specified with the naming convention of |
121 | | -tensorflow-base:<tensorflow_version>-<processor>-<python_version>. |
122 | | - |
123 | | -Before building "final" images: |
124 | | - |
125 | | -Build your "base" image. Make sure it is named and tagged in accordance with your "final" |
126 | | -Dockerfile. Skip this step if you want to build image of Tensorflow Version 1.9.0 and above. |
127 | | - |
128 | | -If you want to build "final" Docker images, for versions 1.6 and above, you will first need to download the appropriate tensorflow pip wheel, then pass in its location as a build argument. These can be obtained from pypi. For example, the files for 1.6.0 are here: |
129 | | - |
130 | | -https://pypi.org/project/tensorflow/1.6.0/#files |
131 | | -https://pypi.org/project/tensorflow-gpu/1.6.0/#files |
132 | | - |
133 | | -Note that you need to use the tensorflow-gpu wheel when building the GPU image. |
134 | | - |
135 | | -Then run: |
136 | | - |
137 | | -:: |
138 | | - |
139 | | - # All build instructions assumes you're building from the same directory as the Dockerfile. |
140 | | - |
141 | | - # CPU |
142 | | - docker build -t <image_name>:<tag> --build-arg py_version=<py_version> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.cpu . |
143 | | - |
144 | | - # GPU |
145 | | - docker build -t <image_name>:<tag> --build-arg py_version=<py_version> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.gpu . |
146 | | - |
147 | | -:: |
148 | | - |
149 | | - # Example |
150 | | - docker build -t preprod-tensorflow:1.6.0-cpu-py2 --build-arg py_version=2 |
151 | | - --build-arg framework_installable=tensorflow-1.6.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu . |
152 | | - |
153 | | -The dockerfiles for 1.4 and 1.5 build from source instead, so when building those, you don't need to download the wheel beforehand: |
154 | | - |
155 | | -:: |
156 | | - |
157 | | - # All build instructions assumes you're building from the same directory as the Dockerfile. |
158 | | - |
159 | | - # CPU |
160 | | - docker build -t <image_name>:<tag> -f Dockerfile.cpu . |
161 | | - |
162 | | - # GPU |
163 | | - docker build -t <image_name>:<tag> -f Dockerfile.gpu . |
164 | | - |
165 | | -:: |
166 | | - |
167 | | - # Example |
168 | | - |
169 | | - # CPU |
170 | | - docker build -t preprod-tensorflow:1.4.1-cpu-py2 -f Dockerfile.cpu . |
171 | | - |
172 | | - # GPU |
173 | | - docker build -t preprod-tensorflow:1.4.1-gpu-py2 -f Dockerfile.gpu . |
174 | | - |
175 | | - |
176 | | -Running the tests |
177 | | ------------------ |
178 | | - |
179 | | -Running the tests requires installation of the SageMaker TensorFlow Container code and its test |
180 | | -dependencies. |
181 | | - |
182 | | -:: |
183 | | - |
184 | | - git clone https://github.com/aws/sagemaker-tensorflow-containers.git |
185 | | - cd sagemaker-tensorflow-containers |
186 | | - pip install -e .[test] |
187 | | - |
188 | | -Tests are defined in |
189 | | -`test/ <https://github.com/aws/sagemaker-tensorflow-containers/tree/master/test>`__ |
190 | | -and include unit, integration and functional tests. |
191 | | - |
192 | | -Unit Tests |
193 | | -~~~~~~~~~~ |
194 | | - |
195 | | -If you want to run unit tests, then use: |
196 | | - |
197 | | -:: |
198 | | - |
199 | | - # All test instructions should be run from the top level directory |
200 | | - |
201 | | - pytest test/unit |
202 | | - |
203 | | -Integration Tests |
204 | | -~~~~~~~~~~~~~~~~~ |
205 | | - |
206 | | -Running integration tests require `Docker <https://www.docker.com/>`__ and `AWS |
207 | | -credentials <https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/setup-credentials.html>`__, |
208 | | -as the integration tests make calls to a couple AWS services. The integration and functional |
209 | | -tests require configurations specified within their respective |
210 | | -`conftest.py <https://github.com/aws/sagemaker-tensorflow-containers/blob/master/test/integration/conftest.py>`__.Make sure to update the account-id and region at a minimum. |
211 | | - |
212 | | -Integration tests on GPU require `Nvidia-Docker <https://github.com/NVIDIA/nvidia-docker>`__. |
213 | | - |
214 | | -Before running integration tests: |
215 | | - |
216 | | -#. Build your Docker image. |
217 | | -#. Pass in the correct pytest arguments to run tests against your Docker image. |
218 | | - |
219 | | -If you want to run local integration tests, then use: |
220 | | - |
221 | | -:: |
222 | | - |
223 | | - # Required arguments for integration tests are found in test/integ/conftest.py |
224 | | - |
225 | | - pytest test/integration --docker-base-name <your_docker_image> \ |
226 | | - --tag <your_docker_image_tag> \ |
227 | | - --framework-version <tensorflow_version> \ |
228 | | - --processor <cpu_or_gpu> |
229 | | - |
230 | | -:: |
231 | | - |
232 | | - # Example |
233 | | - pytest test/integration --docker-base-name preprod-tensorflow \ |
234 | | - --tag 1.0 \ |
235 | | - --framework-version 1.4.1 \ |
236 | | - --processor cpu |
237 | | - |
238 | | -Functional Tests |
239 | | -~~~~~~~~~~~~~~~~ |
240 | | - |
241 | | -Functional tests are removed from the current branch, please see them in older branch `r1.0 <https://github.com/aws/sagemaker-tensorflow-container/tree/r1.0#functional-tests>`__. |
242 | | - |
243 | 16 | Contributing |
244 | 17 | ------------ |
245 | 18 |
|
246 | 19 | Please read |
247 | | -`CONTRIBUTING.md <https://github.com/aws/sagemaker-tensorflow-containers/blob/master/CONTRIBUTING.md>`__ |
| 20 | +`CONTRIBUTING.md <https://github.com/aws/sagemaker-tensorflow-training-toolkit/blob/master/CONTRIBUTING.md>`__ |
248 | 21 | for details on our code of conduct, and the process for submitting pull |
249 | 22 | requests to us. |
250 | 23 |
|
251 | 24 | License |
252 | 25 | ------- |
253 | 26 |
|
254 | | -SageMaker TensorFlow Containers is licensed under the Apache 2.0 License. It is copyright 2018 |
| 27 | +SageMaker TensorFlow Training Toolkit is licensed under the Apache 2.0 License. It is copyright 2018 |
255 | 28 | Amazon.com, Inc. or its affiliates. All Rights Reserved. The license is available at: |
256 | 29 | http://aws.amazon.com/apache2.0/ |