~ / cmdr2

projects: freebird, easy diffusion

hacks: carbon editor, torchruntime, findstarlink

  • #rocm
  • #pytorch
  • #easydiffusion
  • #torchruntime

Continued from Part 1. Spent a few days figuring out how to compile binary wheels of PyTorch and include all the necessary libraries (ROCm libs or CUDA libs). tl;dr - In Part 2, the compiled PyTorch wheels now include the required libraries (including ROCm). But this isn't over yet. Torch starts now, but adding two numbers with it produces garbage values (on the GPU). There's probably a bug in the included rocBLAS version; I might need to recompile rocBLAS for gfx803 separately. Will tackle that in Part 3 (tbd).
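For reference, the failing sanity check was as simple as this (a minimal sketch; ROCm builds of PyTorch expose the GPU through the regular "cuda" device API):

    import torch

    a = torch.ones(3, device="cuda")  # "cuda" maps to the ROCm GPU on ROCm builds
    b = torch.ones(3, device="cuda")
    print(a + b)  # expected tensor([2., 2., 2.]); on gfx803 this produced garbage values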

  • #rocm
  • #pytorch
  • #easydiffusion
  • #torchruntime

Continued in Part 2, where I figured out how to include the required libraries in the wheel. Spent all of yesterday trying to compile PyTorch with the build-time PYTORCH_ROCM_ARCH=gfx803 environment variable. tl;dr - In Part 1, I compiled wheels for PyTorch with ROCm, in order to add support for older AMD cards like the RX 480. I managed to compile the wheels, but they didn't include the required ROCm libraries; that's what Part 2 sorts out.
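Roughly, the build boiled down to setting the arch before kicking off the wheel build. A hedged sketch (not the exact commands; USE_ROCM, the setup.py wheel build, and the pytorch checkout directory are assumptions about the standard PyTorch ROCm build path):

    import os
    import subprocess

    # Build a PyTorch wheel targeting the RX 480's gfx803 architecture.
    # Sketch only: the real build also needs a ROCm install, and lots of RAM and time.
    env = dict(os.environ, USE_ROCM="1", PYTORCH_ROCM_ARCH="gfx803")
    subprocess.run(["python", "setup.py", "bdist_wheel"], cwd="pytorch", env=env, check=True)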

  • #easydiffusion
  • #torchruntime
  • #torch
  • #ml

Spent the last few days writing torchruntime, which automatically installs the correct torch distribution based on the user's OS and graphics card. The package was created by extracting this logic out of Easy Diffusion and refactoring it into a cleaner implementation (with tests). It can be installed (on Windows/Linux/Mac) using pip install torchruntime. The main intention is to make it easier for developers to contribute updates (e.g. for newer or older GPUs). Previously this code was hard to find or modify, since it was buried deep inside Easy Diffusion's internals.
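The underlying idea is easy to sketch (this is an illustration of the approach, not torchruntime's actual code or API; the vendor mapping is hypothetical, and the index URLs are PyTorch's public wheel indexes):

    import platform
    import subprocess
    import sys

    def pick_index_url(gpu_vendor):
        """Hypothetical mapping from a detected GPU vendor to a torch package index."""
        if gpu_vendor == "nvidia":
            return "https://download.pytorch.org/whl/cu124"
        if gpu_vendor == "amd" and platform.system() == "Linux":
            return "https://download.pytorch.org/whl/rocm6.2"
        return None  # fall back to the default (CPU) wheels

    def install_torch(gpu_vendor):
        cmd = [sys.executable, "-m", "pip", "install", "torch"]
        url = pick_index_url(gpu_vendor)
        if url:
            cmd += ["--index-url", url]
        subprocess.run(cmd, check=True)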

  • #easydiffusion
  • #amd
  • #directml

Spent most of the day doing support work for Easy Diffusion, and experimenting with torch-directml for AMD support on Windows. From the initial experiments, torch-directml seems to work properly with Easy Diffusion. I ran it on my NVIDIA card, and another user ran it on their AMD Radeon RX 7700 XT. It's 7-10x faster than the CPU, so it looks promising. It's 2x slower than CUDA on my NVIDIA card, but users with NVIDIA cards aren't the target audience for this change.
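Wiring it up is straightforward; a minimal sketch using torch-directml's device() helper, which returns the default DirectML adapter:

    import torch
    import torch_directml

    dml = torch_directml.device()   # default DirectML adapter (AMD/NVIDIA/Intel)
    x = torch.randn(64, 64).to(dml)
    y = torch.randn(64, 64).to(dml)
    print((x @ y).cpu().shape)      # the matmul runs on the DirectML device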

  • #easydiffusion
  • #ui
  • #v4

Spent a few days prototyping a UI for Easy Diffusion v4. Files are at this repo. The main focus was to get a simple but pluggable UI, backed by a reactive data model, that allows splitting the codebase into individual components (each with its own file), and that requires only a text editor and a browser to develop, i.e. no compilation or Node.js-based tooling.

  • #vr
  • #ui
  • #freebird

Really need to figure out a way to render standard HTML elements (styled with CSS and modified with JS) in a 3D scene. Reinventing excellent libraries like PrimeVue inside 3D (for rendering in VR) is just wasteful. There have been attempts, e.g. A-Frame, but what's really needed is a way to view a regular webpage in 3D: just ordinary HTML elements, drawn by the browser's regular DOM renderer. The pieces feel like they're there conceptually, but the implementation gap is probably big enough that it hasn't happened yet.

  • #c++
  • #imgui
  • #browser

A simple browser-like shell using ImGui and GLFW. It was supposed to show a webview, but I couldn't figure out how to embed the webview inside the window (instead of it popping up in its own window). Maybe I'll revisit this in the future if I can figure it out. To build it: create a folder named thirdparty (alongside main.cpp and CMakeLists.txt), clone the git repositories for imgui and glfw into the thirdparty folder, then configure and build with CMake (e.g. cmake -B build, followed by cmake --build build).

  • #findstarlink
  • #ai
  • #llm

I spent some time today doing support for Freebird, Puppetry and Easy Diffusion. Identified a bug in Freebird (bone-axis gizmos aren't scaling correctly in VR), got annoyed by how little documentation I've written for Puppetry's scripting API, and was reminded of how annoying it is that Easy Diffusion force-downloads the poor-quality starter model (stock SD 1.4) during installation. The majority of the day was spent using a local LLM to classify emails. I get a lot of repetitive emails for FindStarlink: people telling me whether they saw Starlink or not (using the predictions on the website). The first part of my reply is always a boilerplate "Glad you saw it" or "Sorry about that", followed by email-specific replies. I'd really like the system to auto-fill that first part of the email if it's a report about a Starlink sighting.
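As a first cut, the classification step can be this small. A sketch assuming a local OpenAI-compatible server on localhost:1234 (e.g. LM Studio); the model name and the labels are placeholders, not the actual setup:

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

    def classify_sighting(email_body):
        """Label an incoming FindStarlink email as SAW, MISSED or OTHER."""
        resp = client.chat.completions.create(
            model="local-model",  # placeholder; whatever model the local server hosts
            messages=[
                {"role": "system", "content": "Classify this email from a Starlink "
                 "spotter. Reply with exactly one word: SAW, MISSED or OTHER."},
                {"role": "user", "content": email_body},
            ],
        )
        return resp.choices[0].message.content.strip()

    # The boilerplate opener can then be picked from the label:
    openers = {"SAW": "Glad you saw it!", "MISSED": "Sorry about that!"}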

  • #ai
  • #ml
  • #llm

Built two experiments using locally-hosted LLMs. One is a script that lets two bots chat with each other endlessly. The other is a browser bookmarklet that summarizes the selected text in 300 words or less. Both use an OpenAI-compatible API, so they can be pointed at regular OpenAI-compatible remote servers, or at your own locally-hosted servers (like LM Studio). Links: Bot Chat, Summarize Bookmarklet. The bot chat script is interesting: the conversation is engaging initially, but starts stagnating and repeating after 20-30 messages. The script lets you define the names and descriptions of the two bots, the scene description, and the first message of the first bot. After that, the two bots talk to each other endlessly.
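The core loop of the bot chat idea can be sketched like this (a sketch, not the actual script: it assumes the openai Python client pointed at a local server, and the personas, opening line and model name are placeholders):

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")
    personas = [
        "You are Ada, a curious engineer exploring an abandoned space station.",
        "You are Bob, a grumpy poet stuck on the same station.",
    ]
    history = ["Hello? Is anyone else on this station?"]  # first message, from bot 0

    def next_message(bot_idx):
        # Replay the transcript from this bot's point of view: its own past
        # lines become "assistant" turns, the other bot's lines become "user" turns.
        msgs = [{"role": "system", "content": personas[bot_idx]}]
        for i, text in enumerate(history):
            role = "assistant" if i % 2 == bot_idx else "user"
            msgs.append({"role": role, "content": text})
        resp = client.chat.completions.create(model="local-model", messages=msgs)
        return resp.choices[0].message.content

    for turn in range(1, 20):          # alternate speakers; bot 0 spoke first
        history.append(next_message(turn % 2))
        print(f"[bot {turn % 2}] {history[-1]}")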

  • #easydiffusion
  • #v4
  • #ui

Notes on two directions for ED4's UI that I'm unlikely to pursue further. One is a desktop app with a full-screen webview (for the app UI). The other is writing the tabbed, browser-like shell of ED4 in a compiled language (like Go or C++) and loading the contents of the tabs as regular webpages (using webviews), so it would load URLs like http://localhost:9000/ui/image_editor and http://localhost:9000/ui/settings, etc.