~ / cmdr2

  • #ml
  • #transformers
  • #diffusion

Spent a few days learning more about diffusion models, UNets and Transformers. Wrote two toy implementations: a denoising diffusion model (following diffusers’ tutorial) and a simple multi-head self-attention model for next-character prediction (following Karpathy’s video). The denoising model was trained on the Smithsonian Butterfly dataset, and it successfully generates new butterfly images. But it’s unconditional (i.e. no text prompts) and non-latent (i.e. it works directly on the image pixels, instead of a compressed latent space).
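For reference, here is a minimal sketch of the forward (noising) step that a denoising diffusion model is trained to undo. It assumes a DDPM-style linear beta schedule; the schedule values, the function names and the tiny `x0` list are illustrative assumptions, not the code from the toy implementation above. Because the model is non-latent, `x0` stands in for raw pixel values scaled to [-1, 1].

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances, one per timestep (assumed values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar[t] = product of (1 - beta[s]) for all s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, t, alpha_bars, rng):
    """Jump straight from x_0 to x_t; returns (noisy_pixels, noise).

    x_t = sqrt(alpha_bar[t]) * x_0 + sqrt(1 - alpha_bar[t]) * noise
    During training, the network sees x_t and t and is asked to predict `noise`.
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [signal * x + sigma * e for x, e in zip(pixels, noise)]
    return noisy, noise


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)

x0 = [0.5, -0.25, 1.0, -1.0]  # hypothetical stand-in for image pixels
x_early, _ = add_noise(x0, 0, alpha_bars, rng)           # barely noised
x_late, eps_late = add_noise(x0, 999, alpha_bars, rng)   # almost pure noise
```

At the last timestep `alpha_bars[-1]` is close to zero, so `x_late` is nearly indistinguishable from Gaussian noise. Generation runs the process in reverse: start from pure noise and repeatedly subtract the noise the model predicts.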