We have put together the complete Transformer model, and now we are ready to train it for neural machine translation. For this purpose, we will use a training dataset of short English and German sentence pairs. We will also revisit the role of masking in computing the accuracy and loss metrics during training.
In this tutorial, you will discover how to train the Transformer model for neural machine translation.
Read on for the process, including a lot of code.
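As a preview of the masking idea mentioned above, here is a minimal, framework-free sketch of how padded positions can be left out of the loss and accuracy computations. It is an illustration only, not the implementation developed in this tutorial. The function name `masked_loss_and_accuracy`, the choice of token id 0 as the padding value, and the toy probabilities are all assumptions made for this example.

```python
import math

PAD_ID = 0  # assumption: token id 0 marks padded positions


def masked_loss_and_accuracy(targets, probs, pad_id=PAD_ID):
    """Average cross-entropy and accuracy over non-padded positions only.

    targets: list of integer token ids (padded with pad_id).
    probs:   list of predicted probability distributions, one per position.
    """
    total_loss = 0.0
    correct = 0
    count = 0
    for target, dist in zip(targets, probs):
        if target == pad_id:
            continue  # padded position: contributes to neither metric
        # Cross-entropy for this position; clamp to avoid log(0).
        total_loss += -math.log(max(dist[target], 1e-12))
        # The predicted token is the index with the highest probability.
        predicted = max(range(len(dist)), key=dist.__getitem__)
        correct += int(predicted == target)
        count += 1
    if count == 0:
        return 0.0, 0.0
    return total_loss / count, correct / count


# Toy sequence of four positions over a vocabulary of three tokens.
# The last two positions are padding and are ignored.
targets = [2, 1, 0, 0]
probs = [
    [0.1, 0.2, 0.7],  # correct: token 2 has the highest probability
    [0.1, 0.3, 0.6],  # wrong: token 2 predicted, token 1 expected
    [0.8, 0.1, 0.1],  # padding
    [0.1, 0.1, 0.8],  # padding
]
loss, accuracy = masked_loss_and_accuracy(targets, probs)
print(loss, accuracy)
```

Without the mask, the two padded positions would be averaged into both metrics and would distort them, which is why masking matters when training on sentences of different lengths.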