~ / cmdr2

  • #easydiffusion
  • #ui
  • #design
  • #v4

Worked on a few UI design ideas for Easy Diffusion v4. I’ve uploaded the work-in-progress mockups at https://github.com/easydiffusion/files. So far, I’ve mocked out the design for the outer skeleton: the new tabbed interface, the status bar, and the unified main menu. I also worked on how they would look on mobile devices. It gives me a rough idea of the Vue components that would need to be written, and the surface area that plugins can impact. For example, plugins can add a new menu entry only in the Plugins sub-menu.

  • #freebird
  • #vr
  • #ar
  • #blender

Freebird is finally out on sale at https://freebirdxr.com/buy. It’s still called an Early Access version, since it needs more work to feel like a cohesive product. It’s already got quite a lot of features, and it’s definitely useful. But I think it’s still missing a few key features, and needs an overall “fine-tuning” of the user experience and interface. So yeah, lots more to do. But it feels good to get something out on sale after nearly 4 years of development. Freebird has already spent 2 years in free public beta, so quite a number of people have already used it.

  • #ai
  • #learning
  • #self-awareness

Today I explored an idea for what might happen if an AI model runs continuously, processing inputs, acting and receiving sensory inputs without interruption. Maybe in a text-adventure game. Instead of responding to isolated prompts, the AI would live in a simulated environment, interacting with its world in real time. The experiment is about observing whether behaviors like an understanding of time, awareness, or even a sense of self could emerge naturally through sustained operation.

  • #ml
  • #transformers
  • #diffusion

Spent a few days learning more about Diffusion models, UNets and Transformers. Wrote a few toy implementations of a denoising diffusion model (following diffusers’ tutorial) and a simple multi-headed self-attention model for next-character prediction (following Karpathy’s video). The non-latent version of the denoising model was trained on the Smithsonian Butterfly dataset, and it successfully generates new butterfly images. But it’s unconditional (i.e. no text prompts), and non-latent (i.e. works directly on the image data, instead of a compressed latent space).
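For reference, the forward (noising) step that a denoising diffusion model learns to undo can be sketched in plain Python. This is not the toy implementation mentioned above, just the standard DDPM formulation with a linear beta schedule; the function names, the default schedule values, and the flat list standing in for an image are assumptions made for illustration.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (assumed defaults, as commonly used for DDPM)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bars(betas):
    """Cumulative product of (1 - beta_t): how much signal survives at step t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def add_noise(x0, t, abar, rng):
    """Sample x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * noise."""
    signal = math.sqrt(abar[t])
    sigma = math.sqrt(1.0 - abar[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * x + sigma * n for x, n in zip(x0, noise)]
    # The model is trained to predict `noise` given (xt, t).
    return xt, noise


if __name__ == "__main__":
    abar = alpha_bars(linear_betas(1000))
    pixels = [0.5, -0.25, 1.0]  # stand-in for flattened image data
    noisy, target = add_noise(pixels, 500, abar, random.Random(0))
    print(noisy, target)
```

Being unconditional and non-latent, the model sees only the noisy pixels and the timestep; a latent variant would apply the same step to a compressed encoding of the image instead.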

  • #easydiffusion
  • #stable-diffusion
  • #c++

Spent some more time on the v4 experiments for Easy Diffusion (i.e. C++ based, fast-startup, lightweight). stable-diffusion.cpp is missing a few features that will be necessary for Easy Diffusion’s typical workflow. I wasn’t keen on forking stable-diffusion.cpp, but it’s probably faster to work on a fork for now. So far, I’ve added live preview and per-step progress callbacks (based on a few pending pull-requests on sd.cpp), and protection from GGML_ASSERT killing the entire process. I’ve also been looking at the ability to load individual models (like the VAE) without needing to reload the entire SD model.

  • #easydiffusion
  • #stable-diffusion

Spent a few days getting a C++ based version of Easy Diffusion working, using stable-diffusion.cpp. I’m working with a fork of stable-diffusion.cpp here, to add a few changes like per-step callbacks, live image previews etc. It doesn’t have a UI yet, and currently hardcodes a model path. It exposes a RESTful API server (written using the Crow C++ library), and uses a simple task manager that runs image generation tasks on a thread. The generated images are available at an API endpoint, which returns the binary JPEG/PNG image directly (instead of a base64 encoding).
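The task-manager idea can be sketched in a few lines. The actual implementation is in C++ and isn't shown here, so this is only a Python illustration of the pattern: a queue feeding a single worker thread, with results kept in a dictionary that an API endpoint could read from. `TaskManager`, `submit`, `get`, `wait_all` and the `generate` callable are all hypothetical names.

```python
import queue
import threading
import uuid


class TaskManager:
    """Runs submitted tasks one at a time on a single background thread."""

    def __init__(self, generate):
        self._generate = generate      # hypothetical callable: prompt -> image bytes
        self._pending = queue.Queue()
        self._results = {}             # task_id -> status record
        self._lock = threading.Lock()
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, prompt):
        """Queue a task and return its id immediately."""
        task_id = uuid.uuid4().hex
        with self._lock:
            self._results[task_id] = {"status": "pending", "image": None}
        self._pending.put((task_id, prompt))
        return task_id

    def get(self, task_id):
        """Return a copy of the current status record for a task."""
        with self._lock:
            return dict(self._results[task_id])

    def wait_all(self):
        """Block until every queued task has been processed."""
        self._pending.join()

    def _run(self):
        while True:
            task_id, prompt = self._pending.get()
            try:
                entry = {"status": "done", "image": self._generate(prompt)}
            except Exception as exc:
                # A failed task is recorded instead of killing the worker.
                entry = {"status": "error", "image": None, "error": str(exc)}
            with self._lock:
                self._results[task_id] = entry
            self._pending.task_done()


if __name__ == "__main__":
    manager = TaskManager(lambda prompt: b"fake-image:" + prompt.encode())
    tid = manager.submit("a butterfly")
    manager.wait_all()
    print(manager.get(tid))
```

An HTTP handler would call `submit` when a request arrives and serve the raw bytes from `get(task_id)["image"]` once the status is `done`.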

Wrote a simple hex-dumper for analysing DLL and executable files. It uses pefile.
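The formatting half of a hex-dumper is small enough to sketch with the standard library alone. This is not the tool described above and doesn't use pefile; it assumes the bytes have already been read (for example, from a section that a PE parser located), and `hex_dump` with its `width` and `start` parameters is a hypothetical helper.

```python
def hex_dump(data: bytes, width: int = 16, start: int = 0) -> str:
    """Format bytes as rows of: offset, hex columns, printable ASCII."""
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as '.' in the ASCII column.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the ASCII column lines up on a short last row.
        lines.append(f"{start + i:08x}  {hex_part:<{width * 3 - 1}}  {ascii_part}")
    return "\n".join(lines)


if __name__ == "__main__":
    # PE files begin with the two-byte DOS magic "MZ".
    print(hex_dump(b"MZ\x90\x00\x03\x00\x00\x00\x04\x00\x00\x00\xff\xff\x00\x00\xb8"))
```

The `start` argument lets the printed offsets match the position of the chunk within the file rather than always beginning at zero.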

  • #car
  • #simulation
  • #game
  • #drs
  • #featured

Continuing on the race car simulator series. Last week, the “effective tire friction” calculation was implemented, which modeled the grip at the point of contact between the tire and the road surface. This intentionally did not take into account the vertical load (or any other forces), since the purpose was limited to calculating the “effective” friction coefficient based on the material conditions. The next step was implemented yesterday, which calculates the effective force the tire will apply on the wheel axle, in reaction to the torque applied by the engine on the wheel axle. That reaction force will cause the car to move forward. It also factors in the existing inertial force (i.e. if the car is already moving) in order to model sideways slip (e.g. for drifting).
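The step from engine torque to a force on the axle can be roughly sketched as follows. This is not the simulator’s code: it assumes the simplest longitudinal model, where the requested force is torque divided by wheel radius and the tire can transmit at most the effective friction coefficient times the vertical load. The function and parameter names are hypothetical, and sideways slip is not modeled here.

```python
def axle_reaction_force(engine_torque_nm, wheel_radius_m, mu_effective, vertical_load_n):
    """Longitudinal force (N) the tire transmits to the axle, limited by grip.

    Assumed model: requested force = torque / radius, clamped to
    +/- (mu_effective * vertical_load). Positive values push the car forward.
    """
    requested = engine_torque_nm / wheel_radius_m
    grip_limit = mu_effective * vertical_load_n
    # Beyond the grip limit the wheel spins (or locks) and force saturates.
    return max(-grip_limit, min(requested, grip_limit))


if __name__ == "__main__":
    # Within the grip limit: all of the requested force is transmitted.
    print(axle_reaction_force(300.0, 0.25, 1.0, 4000.0))
    # Too much torque for the available grip: force saturates at the limit.
    print(axle_reaction_force(2000.0, 0.25, 1.0, 4000.0))
```

Keeping the friction coefficient separate from the load, as in the earlier step, means the same coefficient can be reused per wheel as the load shifts under acceleration or braking.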

  • #findstarlink

Following up on yesterday’s post, there’s now full automation for the conversion of provisional NORAD IDs to the official ones (once they’re available in Celestrak). This automation is still waiting to be deployed, because it needs to be tested with the official NORAD IDs for yesterday’s Starlink launch (G6-77), once they’re assigned next week. Update: this automation has now been deployed. So now, the only processes still done manually are (a) selecting a new leader for a train, if the current leader drifts away from the train, and (b) removing old trains that have spread out completely.

  • #findstarlink

Spent two days automating some of the processes around findstarlink.com, and updating some of the code that had started bit-rotting. Most of FindStarlink’s operations run as individual AWS Lambda functions that are triggered periodically by CloudWatch Events (and Schedules). But a few processes are still done manually, mainly due to a mix of laziness and the fact that they’re a bit tricky to automate. I also needed to migrate the existing automations to a newer NodeJS runtime in AWS Lambda, since the current runtime was nearing end-of-life support.