Over the last few years, autoregressive Transformers have brought a steady stream of breakthroughs in generative modeling. These models generate each element of a sample – the pixels of an image, the characters of a text (typically in “token” chunks), the samples of an audio waveform, and so on – by predicting one element after…
