DeepSpeed runtime study
The runtime includes these parts:
The runtime's entry point is deepspeed/runtime/engine.py: https://github.com/microsoft/DeepSpeed/blob/master/deepspeed/runtime/engine.py
Elasticity: not supported in combination with model parallelism.
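DeepSpeed uses gradient accumulation to extract pipeline parallelism: each training batch is split into micro-batches that the pipeline stages can process in parallel, and gradients are accumulated across the micro-batches before the optimizer step.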
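To make the micro-batch idea concrete, here is a toy sketch in plain Python. It does not import DeepSpeed and is not DeepSpeed's actual scheduler; the function names forward_schedule and effective_batch_size are made up for these notes. It only shows the forward passes of a simple fill-and-drain pipeline, plus the usual relation between micro-batch size, gradient accumulation steps, and the effective batch size.

```python
def forward_schedule(num_stages, num_micro_batches):
    """Toy fill-and-drain schedule (forward passes only).

    Returns one list per clock tick; each list holds the
    (stage, micro_batch) pairs that are busy during that tick.
    Hypothetical helper for illustration, not a DeepSpeed API.
    """
    ticks = []
    for t in range(num_stages + num_micro_batches - 1):
        # Stage s works on micro-batch (t - s) once that micro-batch
        # has passed through all earlier stages.
        active = [
            (s, t - s)
            for s in range(num_stages)
            if 0 <= t - s < num_micro_batches
        ]
        ticks.append(active)
    return ticks


def effective_batch_size(micro_batch_size, grad_accum_steps, data_parallel_size):
    """Samples contributing to one optimizer step.

    Assumes the usual relation: micro-batch size per GPU times
    gradient accumulation steps times number of data-parallel replicas.
    """
    return micro_batch_size * grad_accum_steps * data_parallel_size


if __name__ == "__main__":
    # Two pipeline stages, four micro-batches per training batch.
    for tick, active in enumerate(forward_schedule(2, 4)):
        print(tick, active)
    print(effective_batch_size(4, 8, 2))
```

With a single micro-batch only one stage is ever busy at a time; with more micro-batches the stages overlap, which is why a larger number of gradient accumulation steps gives the pipeline more parallelism to exploit.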