The CSCS Knowledge Base has been migrated to a new site, docs.cscs.ch.

The documentation on this page has been migrated to docs.cscs.ch/guides/mlp_tutorials/

The information on this page is out of date, and all new documentation will be written on the new site.

In this tutorial, we will build a container image for running nanotron training jobs. As a proof of concept, we will train a 109M-parameter model on ~100M wikitext tokens.