Commit 84c0f72

Merge pull request #333 from RobotSail/deepspeed-cpu-install-instructions
docs: include docs on installing deepspeed w/ cpuadam
2 parents e19c744 + a6e1f77 commit 84c0f72

README.md

Lines changed: 30 additions & 1 deletion
@@ -121,7 +121,36 @@ allow you to customize aspects of the ZeRO stage 2 optimizer.

For more information about DeepSpeed, see [deepspeed.ai](https://www.deepspeed.ai/)

#### DeepSpeed with CPU Offloading

When you use DeepSpeed with CPU offloading, you'll usually hit an error indicating that the CPU implementation of the Adam optimizer (CPUAdam) is missing. To resolve this, follow the steps below:

**Rebuild DeepSpeed with CPUAdam**:

Rebuild DeepSpeed so that the CPUAdam optimizer is compiled in:

```bash
# Uninstall DeepSpeed, then reinstall with the flags that build CPUAdam
pip uninstall deepspeed
DS_BUILD_CPU_ADAM=1 DS_BUILD_UTILS=1 pip install deepspeed --no-deps
```

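After reinstalling, it can be worth confirming that the CPUAdam op was actually compiled. The following is a minimal sketch, not an official procedure. It assumes the `ds_report` command that ships with DeepSpeed is on your `PATH`, and that its op table contains a `cpu_adam` row with `[YES]` in the installed column (the exact layout may differ between versions). `cpu_adam_installed` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads a ds_report-style op table on stdin and succeeds
# only if the cpu_adam row is marked [YES] (assumed column layout).
cpu_adam_installed() {
  grep -Eq '^cpu_adam[ .].*\[YES\]'
}

# Only query DeepSpeed when ds_report is actually available.
if command -v ds_report >/dev/null 2>&1; then
  if ds_report 2>/dev/null | cpu_adam_installed; then
    echo "cpu_adam: prebuilt"
  else
    echo "cpu_adam: not prebuilt; re-run the install with DS_BUILD_CPU_ADAM=1"
  fi
fi
```
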
**Ensure `-lcurand` is linked correctly**:

A common problem is that the linker cannot resolve `-lcurand` when DeepSpeed recompiles. To fix this, first locate the `libcurand.so` file on your machine:

```bash
find / -name 'libcurand*.so*' 2>/dev/null
```

If `libcurand.so` is not present in the `/usr/lib64` directory, add a symlink to it. For example:

```bash
sudo ln -s /usr/local/cuda/lib64/libcurand.so.10 /usr/lib64/libcurand.so
```

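The locate-and-link steps above can be combined into one guarded script. This is a sketch under stated assumptions: `find_curand` and `link_cmd_for` are hypothetical helper names, and the search root defaults to `/usr/local/cuda`, which you may need to adjust for your CUDA install. The script only prints the `ln -s` command, so you can review it before running it yourself.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first libcurand shared object (sorted by
# path) under the given root. Default root /usr/local/cuda is an assumption.
find_curand() {
  find "${1:-/usr/local/cuda}" -name 'libcurand*.so*' 2>/dev/null | sort | head -n 1
}

# Hypothetical helper: print the symlink command for a library path, or
# print nothing if the link target already exists (even as a dangling link).
link_cmd_for() {
  lib_path="$1"
  target="${2:-/usr/lib64/libcurand.so}"
  if [ -e "$target" ] || [ -L "$target" ]; then
    return 0
  fi
  printf 'sudo ln -s %s %s\n' "$lib_path" "$target"
}

found="$(find_curand || true)"
if [ -n "$found" ]; then
  link_cmd_for "$found"
else
  echo "libcurand not found under /usr/local/cuda; try a wider search root" >&2
fi
```

Printing the command rather than executing it keeps the sketch free of side effects; `sudo` is only needed once you decide to run the printed line.
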
### `FSDPOptions`

Like DeepSpeed, we only expose a number of parameters for you to modify with FSDP.
They are listed below:
