Large-scale 4D-parallelism pre-training of Mixture-of-Experts models with 🤗 Transformers *(still a work in progress)*
transformers
moe
data-parallelism
distributed-optimizers
model-parallelism
megatron
mixture-of-experts
pipeline-parallelism
huggingface-transformers
megatron-lm
tensor-parallelism
large-scale-language-modeling
3d-parallelism
zero-1
sequence-parallelism
Updated Dec 14, 2023 · Python