Getting started
The library and brief documentation are available in the GitHub repository. Three minimal walk-throughs are provided as standalone HTML files: quickstart 1, quickstart 2, and quickstart 3. They are the best place to start.
Below are more advanced examples.
GPT (default style)
Original GPT architecture: Radford et al., Improving Language Understanding by Generative Pre-Training
[Figure: GPT architecture diagram, default style. Labels: sequence length; sequence length × vocabulary size.]
GPT (stylized)
[Figure: GPT architecture diagram, stylized. Labels: sequence length; sequence length × vocabulary size.]
Transformer (stylized)
Original Transformer architecture: Vaswani et al., Attention Is All You Need
Layout and colors inspired by dair-ai/ml-visuals/2.png
[Figure: Transformer architecture diagram, stylized. Surviving block labels: Forward, Attention, Multi-Head Attention, Encoding.]
Textual Inversion (default style)
Original Textual Inversion paper: Gal et al., An Image is Worth One Word
VETIM (VETIM paper style)
Original VETIM architecture: Everaert et al., VETIM: Expanding the Vocabulary of Text-to-Image Models only with Text
[Figure: VETIM diagram. Labels: "A rendering of on a black background."; generation module (e.g. diffusion model); image; images; "A rendering of an object on a black background. The object is a twisted, abstract sculpture made of delicate, interlocking tendrils of glass."]
Textual Inversion (VETIM paper style)
Original Textual Inversion paper: Gal et al., An Image is Worth One Word
Layout and colors after Everaert et al., VETIM: Expanding the Vocabulary of Text-to-Image Models only with Text
[Figure: Textual Inversion diagram, VETIM paper style. Labels: "An illustration of a"; generation module (e.g. diffusion model); image; images.]
BibTeX
@misc{everaert2026mlfigtemplate,
author = {{EPFL-IVRL} and Everaert, Martin Nicolas},
title = {{ML} {F}igure {T}emplate: {I}nteractive architecture diagrams for {ML} project pages},
year = {2026},
howpublished = {\url{https://github.com/IVRL/ml-figure-template}}
}