Anukari on the CPU (part 2: CPU optimization)
Captain's Log: Stardate 79317.7
In part 1 of this series of posts, I mentioned that I was surprised to find that naively running my GPU code on the CPU was only 5x slower, when I thought it would be 100x slower. In this post I will explain how I ended up making the CPU implementation much faster than on the GPU.
First approach: spot-vectorization
As mentioned in part 1, I got the original GPU code compiled for the CPU, and then wrote a simple driver to call into this code and run the simulation (in lieu of the code that set up and invoked the GPU kernel).
As you might imagine, Anukari, being a 3D physics simulation, does a lot of arithmetic on float3 vectors of the form {x, y, z}. In other words, vectors of three 32-bit floats. So the first optimization I did was the simplest and most naive thing I could think of, which was to implement all of the float3 operations using SIMD intrinsics. I knew this wouldn’t be optimal, but figured it would give me a sense for whether it was worth investing more work to design a CPU-specific solution.
Note that most of the time, float3 vectors are padded and aligned in memory as if they were float4, in other words to 16-byte boundaries. So really you’re working with vectors like {x, y, z, w}, even though the w component is not actually used.
For this experiment I used the 128-bit SIMD instructions offered by SSE on x86_64 processors and NEON on arm64 processors. Because Anukari’s float3 vectors are really float4 vectors with an ignored w component, it’s extremely simple to implement basic arithmetic operations using SSE/NEON. In both cases, there’s an instruction to load the float4 into a SIMD register, an instruction to do the arithmetic operation (such as add), and then an instruction to store the float4 register back into memory.
Thus, the Float3Add() function might look like this using SSE:
__m128 p1 = _mm_load_ps(&position1);
__m128 p2 = _mm_load_ps(&position2);
__m128 d = _mm_add_ps(p2, p1);
_mm_store_ps(&delta, d);
If you squint, the NEON code is nearly identical:
float32x4_t p1 = vld1q_f32(&position1);
float32x4_t p2 = vld1q_f32(&position2);
float32x4_t d = vaddq_f32(p2, p1);
vst1q_f32(&delta, d);
Along these lines I implemented all the basic float3 operations. Some operations were slightly more complex, such as the dot product, normalization, etc., but this code was all very simple and easy to write, because it dropped straight in to replace the GPU code that operated on float3 vectors in the same way.
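For example, a float3 dot product in this style can be written roughly like the following. This is a sketch rather than Anukari’s actual code, and it assumes SSE3 and vectors padded to {x, y, z, 0}:
#include <immintrin.h>
// Multiply lane-wise, then horizontally sum. Because both w lanes are zero,
// summing all four lanes yields the 3D dot product.
float Float3Dot(const float* a, const float* b) {
  __m128 prod = _mm_mul_ps(_mm_load_ps(a), _mm_load_ps(b));  // {xa*xb, ya*yb, za*zb, 0}
  __m128 shuf = _mm_movehdup_ps(prod);                       // {ya*yb, ya*yb, 0, 0}
  __m128 sums = _mm_add_ps(prod, shuf);                      // lane 0 = xa*xb + ya*yb
  shuf = _mm_movehl_ps(shuf, sums);                          // lane 0 = za*zb + 0
  sums = _mm_add_ss(sums, shuf);                             // lane 0 = full dot product
  return _mm_cvtss_f32(sums);
}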
This was only an afternoon of work, and already I was seeing big speed improvements in the CPU implementation. It was still not as good as the GPU version, but at this point I was convinced that it would be worth redesigning the physics simulation from scratch to take full advantage of CPU SIMD instructions.
Second approach: compiler auto-vectorization
Now that I’ve explained why I decided it was worth redesigning the physics simulation in a CPU-specific way, I should say a few words about why the way the code was written for the GPU is not optimal for the CPU. There are a multitude of reasons, which I’ll lump broadly into two categories.
1. Memory access patterns
The GPU code is heavily-optimized to take advantage of the unique memory situation on the GPU.
Anukari’s GPU design relies on the fact that each threadgroup has a huge number of registers. On a typical modern NVIDIA card, a threadgroup has 65,536 32-bit registers. Compared to a CPU, this is an unfathomably large number. There are some limitations, such as the fact that a single thread can access a maximum of 256 registers, but since Anukari runs up to 1,024 threads, this limit is not relevant (since 65,536 / 1,024 = 64 registers per thread).
Note the very surprising fact that each NVIDIA threadgroup has more register memory (65,536 × 4 bytes = 256 KB) than a typical modern CPU core has L1 cache! This calls for a very different design. Registers are the fastest memory possible, and so Anukari crams as much thread-private information into them as it can.
Another thing Anukari’s GPU design relies on is that all the threads in a thread group have extremely fast access to shared thread group memory, which is essentially manually-mapped L1 cache. Anukari uses this for fast inter-thread communication of physics parameters. For example, when calculating spring forces for a given mass, the position of each neighboring mass is looked up from this shared memory. These lookups are random-access so it’s very important that they are served straight from L1 without any possibility of going to main memory.
On the CPU, Anukari can’t rely on a gigantic register space or ultra-fast shared memory. The design will have to be centered around a new memory layout that is friendly for the CPU cache design.
Note that especially when using SIMD instructions, caching becomes incredibly important, because the SIMD calculation throughput is generally higher than main memory throughput. So the only way to saturate the FPU is to ensure the memory layout allows efficient caching (and prefetching).
2. SIMD-unfriendly control flow
When I was first optimizing Anukari’s GPU code, it was a bit of a surprise to me to find out that for the most part, each GPU thread is a scalar processor. Unlike a CPU, there are no special instructions for operating on vectors, because the parallelism comes from zillions of threads instead. There are some exceptions to this, like Apple Silicon hardware being able to do vector memory loads, but if you look for example at CUDA you’ll find that there’s no Float3Add() method in the APIs. You just have to write it yourself with scalar code.
In many ways this makes the bottom-level GPU code easier to write for highly-parallel problems, because you can just write simple scalar code. You don’t have to worry about the parallelism because that happens at a higher level (threads). Other aspects are more difficult on the GPU, but this part is nice.
But this kind of code layout does not map well onto the CPU, where parallelism comes from SIMD instructions that operate on multiple floats at one time (vectors). On the CPU, you have to think much more explicitly about how to lay out the data so that it can be operated on in SIMD fashion.
For example, take this simple operation:
for (int i = 0; i < m; ++i) {
float x = ComputeX(i);
float y = ComputeY(i);
result[i] = x * y;
}
On the GPU, this code might well be fine. Each thread is basically a scalar machine, so the scalar multiply of x and y is efficient.
But on the CPU, this code is awful. Each multiply will map to a single scalar instruction, which fails to take advantage of the fact that the CPU is capable of 4, 8, or even 16 multiplications via a single instruction (depending on the CPU). Storing the result of the multiplication similarly could be made more efficient with SIMD instructions, but is not.
Auto-vectorization
One way to restructure the above GPU code to take better advantage of the CPU’s resources is as follows:
float xs[m];
float ys[m];
for (int i = 0; i < m; ++i) {
xs[i] = ComputeX(i);
ys[i] = ComputeY(i);
}
for (int i = 0; i < m; ++i) {
result[i] = xs[i] * ys[i];
}
This looks a bit silly at first. We’re doing two passes over the loop instead of one, and we have to store data in two intermediate arrays.
However, there is magic that can happen in the second loop: the compiler can automatically output SIMD instructions instead of scalar instructions. The compiler sees that all three arrays involved are contiguous blocks of floats, and thus they can be loaded as vectors (of whatever width the targeted instruction set handles), operated on as vectors, and stored as vectors. For a machine with a SIMD width of 8, this loop will process 8 floats per iteration, which is potentially a gigantic speedup.
There are many caveats here. For a full 8x speedup, the data being operated on definitely needs to fit into L1 cache, or else the loop will be memory-bound and while it still may be faster than scalar instructions, it won’t be 8x faster. As a consolation prize, though, if the memory is not already cached, at least it is contiguous, so the processor’s automatic prefetching will be effective.
In many cases, especially with small datasets, writing the code this way can allow the compiler to automatically output efficient platform-specific instructions while keeping the code portable and readable.
At this stage I did a complete rewrite of Anukari’s physics engine, restructuring the data structures so that they were laid out in memory in a way that made auto-vectorized loops easy to write. I encapsulated all the basic math operation loops in functions so that the code was pretty readable:
float deltas[m];
Subtract(m, positions1, positions2, deltas);
float lengths[m];
Length(m, deltas, lengths);
And so on. This was a pretty huge overhaul, since the code had to be structured very differently than on the GPU.
But it was worth it. At this point, the speed of the physics simulation on the CPU was very similar to the GPU. For some Anukari presets it was even faster.
Third approach: manual vectorization via intrinsics
Compiler auto-vectorization is really nice. The fact that it lets you write portable code without worrying about the specifics of the platform you’re targeting feels pretty magical. The compiler automatically uses the largest SIMD width available for the platform, and also deals with things like emitting a bit of scalar code at the end of each vector loop to handle “remainder” elements when the array length is not an exact multiple of the SIMD width.
However, there are limitations to what compilers can do. They may feel magical, but they are not.
I found that the compilers I’m using (MSVC, Apple Clang) both had extremely strict limitations on what kinds of loops they can vectorize. The limitations differ between compilers, but in general I’d describe them as extraordinarily conservative.
The problem compiler authors have is that the visible behavior of the instructions they generate must adhere to the language spec at all times. So they have to detect cases where there’s any possibility that vectorization might produce observably different results than scalar code, in which case they cannot vectorize.
This means that to write code that the compiler can auto-vectorize, you have to be extremely deliberate. You have to learn about the compiler’s requirements in detail, and then have to structure the code in a very particular way to ensure that the compiler has nothing to be paranoid about.
The MSVC compiler is really nice here, because you can ask it to report a reason code for every loop that it does not vectorize. But there are more than 35 reason codes it might emit for why a loop was not vectorized. It’s extremely cautious! Here’s one example of a loop that MSVC cannot vectorize, taken from their documentation:
// Code 501 is emitted if the compiler cannot discern the
// induction variable of this loop. In this case, when it checks
// the upper bound of 'i', the compiler cannot prove that the
// function call "bound()" returns the same value each time.
// Also, the compiler cannot prove that the call to "bound()"
// does not modify the values of array A.
for (int i=0; i<bound(); ++i) {
A[i] = A[i] + 1;
}
This leads to situations where as the programmer, you can see that a loop is trivially safe to vectorize, but the compiler cannot prove to itself that it can do so safely and thus emits scalar instructions.
At first I jumped through all the hoops to structure my code in a way where the compiler could auto-vectorize it. All pointers used inside loops were stored as locals to prove that they were not modified concurrently by another thread. All pointers were marked as __restrict to guarantee that they were not aliased. Loops were marked with compiler-specific pragmas to inform the compiler that there were no data dependencies to worry about. And so on.
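Concretely, the hand-holding looked something like this. This is only a sketch; the exact keywords and pragmas differ between MSVC and Clang:
// __restrict promises the compiler that the arrays do not alias, and the
// pragmas promise that there are no loop-carried dependencies to worry about.
void Scale(int m, const float* __restrict xs, float* __restrict results, float k) {
#if defined(_MSC_VER)
#pragma loop(ivdep)
#elif defined(__clang__)
#pragma clang loop vectorize(assume_safety)
#endif
  for (int i = 0; i < m; ++i) {
    results[i] = xs[i] * k;
  }
}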
But in the end, I found that despite jumping through all the hoops, I still had to look at the assembly output every time I compiled my code. Both compilers have options to produce warnings about non-vectorizable loops, but I did not find these completely reliable. Even when they worked, sometimes I found that the way the compiler vectorized a loop was weirdly inefficient.
Ultimately I came to the conclusion that if I was going to have to look at the compiler’s assembly output after every compile, on both of my target platforms (x86_64 and arm64), then I may as well just give up on compiler auto-vectorization and write platform-specific instructions directly using intrinsics.
Initially I was resistant to this idea, but very quickly I found that it was much simpler to explicitly tell the compiler what instructions to use than to hope that it implicitly figured out what I wanted. Knowing what I know now, I will never attempt to rely on auto-vectorization again for a problem that requires SIMD instructions. It is easier to just do it yourself.
So I went through my SIMD code, and for example rewrote functions that were originally like this (leaving out a lot of the details):
void Subtract(int m, float* xs, float* ys, float* results) {
// m is the number of padded float3 vectors, so each array holds m * 4 floats.
for (int i = 0; i < m * 4; ++i) {
results[i] = xs[i] - ys[i];
}
}
To look like this instead:
void Subtract(int m, float* xs, float* ys, float* results) {
#if USE_AVX2
for (int i = 0; i < m * 4; i += 8) {
__m256 x = _mm256_load_ps(&xs[i]);
__m256 y = _mm256_load_ps(&ys[i]);
__m256 r = _mm256_sub_ps(x, y);
_mm256_store_ps(&results[i], r);
}
#elif USE_NEON
for (int i = 0; i < m * 4; i += 4) {
float32x4_t x = vld1q_f32(&xs[i]);
float32x4_t y = vld1q_f32(&ys[i]);
float32x4_t r = vsubq_f32(x, y);
vst1q_f32(&results[i], r);
}
#endif
}
This may look complicated at first glance, but really it’s dirt simple: it’s verbose only because it is so explicit. I have found that it is much easier to get the results I want this way than with auto-vectorization.
By the way, you may have noticed that my hand-written code does not include a scalar remainder loop to handle the case where the length of the input arrays is not an exact multiple of the SIMD width. To keep the code simple, I obviate the need for remainder loops by guaranteeing that all array lengths are rounded up to a multiple of the SIMD width.
This does mean that some extra unused floats may be processed at the end of the loop. But when the arrays are large, this becomes a rounding error.
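The padding rule itself is trivial; here is a sketch, where kSimdWidth would be 8 for AVX2 or 4 for NEON:
// Round an array length up to a multiple of the SIMD width so that full-width
// vector loops never need a scalar remainder pass.
constexpr int kSimdWidth = 8;  // 8 floats per AVX2 register; 4 for NEON
inline int RoundUpToSimdWidth(int n) {
  return ((n + kSimdWidth - 1) / kSimdWidth) * kSimdWidth;
}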
(Note to readers who may try this: if you write a SIMD loop that does a reduction, for example accumulating a sum, be very careful about remainder elements so that you don’t accumulate garbage data! You can guess how I know this.)
At this stage of the physics code rewrite, the CPU simulation’s speed is now uniformly a little faster than on the GPU.
Current approach: bespoke single-pass intrinsics
Despite now having a CPU implementation that worked better than the old GPU version, I was still not satisfied. It was now becoming clear to me that actually I could make the CPU code not just a little faster, but substantially faster than the GPU code.
While the SIMD-vectorized loops above make efficient use of SIMD instructions, and access memory in a linear way that is cache-friendly, there are still some obvious drawbacks.
One problem is that the code ends up iterating over the same data many times, in many loops. If the dataset fits into L1 cache, this is quite fast, but reading L1 is still slower than reading registers, and also we have to consider overhead of the loop instructions themselves. If the dataset is bigger than L1 cache, it’s a performance disaster as things spill to L2 which is much slower.
A deeper problem, though, is that Anukari’s physics simulation has a few places where the data has to come from random access memory fetches rather than beautifully-efficient linear accesses.
For example, while iterating over each spring, we have to fetch the position of the spring’s endpoints from the masses it’s connected to. And later we have to accumulate the spring’s force into memory for each mass, which is also random access. Unfortunately we cannot assume this data is in the L1 cache.
There are tricks here, like sorting the springs by the memory address of their attached masses. But because each spring is attached to two masses, this can only completely linearize the lookups for one end of the springs. It helps, but there is still quite a bit of random access for the other end.
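The sorting trick looks something like this (a sketch with a hypothetical Spring type, not Anukari’s actual data structures):
#include <algorithm>
#include <vector>
struct Spring { int mass_a, mass_b; /* stiffness, rest length, ... */ };
// Sort springs by the index of their first endpoint so the gather for that
// endpoint walks memory (mostly) linearly; the second endpoint stays random.
void SortSpringsForLocality(std::vector<Spring>& springs) {
  std::sort(springs.begin(), springs.end(),
            [](const Spring& lhs, const Spring& rhs) { return lhs.mass_a < rhs.mass_a; });
}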
After a few rounds of profiling and optimization, it was clear that these random accesses were the biggest bottleneck, so I brainstormed ways to improve the situation.
The original code was structured in three parts: 1. a gather pass where random access fetches were stored into a linear array, 2. the processing passes, and 3. a scatter pass where spring forces were written out to random access locations. Both the scatter and gather passes were quite slow.
The trick that ended up helping was to rewrite this code as a single pass that performed the processing in-line with the gather and scatter, so that the processor could pipeline all these instructions and ultimately do all the computation while waiting on the memory fetches and stores.
The original multi-pass algorithm allowed the processor to sit idle with an empty pipeline while waiting for fetches/stores, whereas this new algorithm kept it busy. The pipelining not only parallelized computation with memory access, but also allowed the fetch and store instructions to overlap.
Furthermore, I paid attention to how many CPU registers the loop required, looked at the assembly output to confirm, and managed to write it in such a way that the loop does not spill registers onto the stack. So in some ways Anukari’s spring simulation is theoretically optimal: it reads each required piece of data exactly once from memory, does all the intermediate computations using only registers and only full-width SIMD instructions (zero scalar operations), and writes the result exactly once.
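Structurally, the single-pass version looks something like the following. This is a scalar sketch with made-up names; the real loop is written with full-width SIMD intrinsics as described above:
#include <cmath>
struct Vec3 { float x, y, z; };
struct Spring { int mass_a, mass_b; float stiffness, rest_length; };
// Gather, compute, and scatter all happen inside one loop, so the loads for
// later springs can overlap with the arithmetic for the current one.
void AccumulateSpringForces(int num_springs, const Spring* springs,
                            const Vec3* positions, Vec3* forces) {
  for (int s = 0; s < num_springs; ++s) {
    const Spring& sp = springs[s];
    const Vec3 p1 = positions[sp.mass_a];  // gather (random access)
    const Vec3 p2 = positions[sp.mass_b];  // gather (random access)
    const float dx = p2.x - p1.x, dy = p2.y - p1.y, dz = p2.z - p1.z;
    const float len = std::sqrt(dx * dx + dy * dy + dz * dz);
    const float f = sp.stiffness * (len - sp.rest_length) / len;  // Hooke's law
    forces[sp.mass_a].x += f * dx;  // scatter (random access)
    forces[sp.mass_a].y += f * dy;
    forces[sp.mass_a].z += f * dz;
    forces[sp.mass_b].x -= f * dx;
    forces[sp.mass_b].y -= f * dy;
    forces[sp.mass_b].z -= f * dz;
  }
}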
This technique provided a 3-4x speedup for the spring simulation, which tends to be Anukari’s single hottest loop. Seeing this advantage, I then went on to rewrite the next few hottest loops using the same technique. Even loops that do not perform random memory accesses benefit substantially from less loop overhead and fewer round-trips to L1 cache.
After this round of optimization, the CPU code is now substantially faster than the GPU code in virtually all cases.
Drawbacks
The drawback to this latest way of structuring the SIMD code is that it does make the code significantly more complicated and more difficult to read. Before, the simulation was a series of tidy API calls like Add(), Subtract(), Length(), and so on, and the platform-specific SIMD intrinsics were tucked away in the implementation of that API in compact single-purpose functions.
In the new world, though, the simulation for each type of physics object is completely written inline using platform-specific SIMD intrinsics with no tidy abstraction. In itself this is not so bad, but to squeeze out maximum performance, some of these loops have to be structured in pretty weird ways.
I mentioned above that the hottest loops do all operations at full SIMD width with zero scalar operations. This is the design factor that introduces the most complexity.
The problem is that Anukari often does operations that alternate between 3-dimensional float3 vectors, and scalar values. For example, it might take the length of a float3, which goes from 3D to 1D, and then multiply another float3 by that length, going back from 1D to 3D.
The length operation involves taking the square root of a scalar value, which is very slow, and we definitely want to vectorize. But to vectorize the square root, we need to have enough scalar values to fill a SIMD register.
Using NEON for this example, the SIMD width is 4, meaning a register contains 4 32-bit floats. So if we want to vectorize the square root, we have to unroll the outermost loop so that on each iteration it processes 4 float3s. It takes the dot product of each one with itself to get 4 squared lengths, which are packed into a single SIMD register and then a single square root instruction can be used to compute all 4 square roots at once.
This gets quite messy, and some cleverness is involved to structure things such that the shuffling of 32-bit floats around between SIMD registers can be done using only SIMD instructions (rather than pulling individual SIMD lanes out into float registers and re-packing, using many instructions). But this is all doable, and the results are incredibly worthwhile.
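To make that concrete, here is a rough sketch (not Anukari’s actual code) of how four float3 lengths might be computed at once with NEON on arm64, assuming the AoS {x, y, z, 0} layout. The de-interleaving load does the “shuffling” work entirely in SIMD:
#include <arm_neon.h>
// vld4q_f32 de-interleaves 16 consecutive floats into x, y, z and w registers,
// so four squared lengths can be built up and square-rooted in one instruction.
void Lengths4(const float* vecs, float* lengths) {
  float32x4x4_t v = vld4q_f32(vecs);       // val[0]={x0..x3}, val[1]={y0..y3}, ...
  float32x4_t sq = vmulq_f32(v.val[0], v.val[0]);
  sq = vfmaq_f32(sq, v.val[1], v.val[1]);  // sq += y*y
  sq = vfmaq_f32(sq, v.val[2], v.val[2]);  // sq += z*z
  vst1q_f32(lengths, vsqrtq_f32(sq));      // four square roots at once
}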
Sadly, because there is manual loop unrolling and weird clever shuffling going on, the structure of the code at this point diverges pretty dramatically between AVX2 (with a SIMD width of 8) and NEON (SIMD width of 4). So there are two very independent implementations for the hottest loops.
An aside about float3 handling with SIMD
Earlier in this post, I mentioned that often when handling float3 vectors, they are processed as float4 vectors with an ignored “w” component. This is a bit sad, because it means that 25% of the processor’s SIMD lanes are being wasted.
There is a way to eliminate this inefficiency and to utilize 100% of the SIMD lanes.
The reader may be familiar with the terms Structure of Arrays (SoA) and Array of Structs (AoS). The version of the SIMD code that operates on float3s with an ignored “w” component would be considered AoS, because each float3 is a structure of 4 floats which are contiguous in memory: {x, y, z, w}:
void Subtract_AoS(int m, float* xs, float* ys, float* results) {
for (int i = 0; i < m * 4; i += 8) {
_mm256_store_ps(&results[i],
_mm256_sub_ps(
_mm256_load_ps(&xs[i]),
_mm256_load_ps(&ys[i])));
}
}
But it’s possible to reorganize the data structures in a SoA fashion, in which case we can rewrite the code in a way that does not waste any SIMD lanes:
// Assumes a hypothetical SoA layout, e.g.: struct Float3SoA { float* x; float* y; float* z; };
void Subtract_SoA(int m, Float3SoA* a, Float3SoA* b, Float3SoA* result) {
for (int i = 0; i < m; i += 8) {
_mm256_store_ps(&result->x[i],
_mm256_sub_ps(
_mm256_load_ps(&a->x[i]),
_mm256_load_ps(&b->x[i])));
_mm256_store_ps(&result->y[i],
_mm256_sub_ps(
_mm256_load_ps(&a->y[i]),
_mm256_load_ps(&b->y[i])));
_mm256_store_ps(&result->z[i],
_mm256_sub_ps(
_mm256_load_ps(&a->z[i]),
_mm256_load_ps(&b->z[i])));
}
}
This is pretty verbose and repetitive, but notice that the main change is that instead of one array of {x, y, z, w} structs, there are now three arrays, one for each of the x, y, and z values.
There are in fact two big advantages to this approach. First is the one we’ve already mentioned: there is no “w” component being computed, and thus there are no wasted SIMD lanes. The second advantage is that the number of arithmetic operations per loop iteration is tripled, so the relative overhead of the loop instructions themselves (compare and increment) is decreased. (It’s a subtle detail, but notice that the second version’s for loop has i < m instead of i < m * 4.)
Due to these advantages, I originally used the SoA version of this code for Anukari. However, it eventually turned out that in all the cases that mattered, the AoS version was much faster, so I restructured to use AoS instead.
I found this a little surprising, because whatever advantage the AoS version has, it has to be enough to make up for the 25% SIMD lane wastage.
The reason AoS performs better for Anukari comes back to the fact that the slowest parts of the simulation code are where random memory accesses take place. In the SoA memory layout, randomly accessing a float3 actually requires three random scalar memory accesses. Whereas a random access of a float3 in the AoS layout is a single vector access.
Therefore while the computation passes over the AoS data are somewhat slower, the random accesses in other parts of the code are dramatically faster, more than making up for it.
A slightly more subtle consideration has to do with the size of a cache line, which for the processors I care about is always 64 bytes. Both the SoA and AoS approaches do random lookups of data that is smaller than a single cache line. The CPU has to always read a full cache line from memory, which means that it’s reading extra bytes that we do not use, wasting memory bandwidth.
For SoA, we look up three 4-byte values from separate memory locations. This means that the CPU will fetch three 64-byte memory regions, wasting 180 bytes of memory bandwidth.
For AoS, though, we only look up one 16-byte value (the padded float3). The CPU will fetch one 64-byte memory region, wasting only 48 bytes of memory bandwidth.
In each case it’s possible that there are cache hits, depending on the random access pattern, in which case not all of the fetches are wasted. But if the dataset is large enough, or the lookups sparse enough, SoA wastes almost four times as much memory bandwidth as AoS does.
One last little side-side note: for random memory accesses, CPUs offer a feature called prefetching: instructions that tell the CPU about a future memory access that will be made. The idea is that in a random access loop, as you access each item you also tell the CPU about an item you’ll need a few iterations later. In principle this allows the CPU to start fetching that memory before it is needed, concurrently with execution of the current instructions.
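A prefetching loop looks roughly like this. This is a sketch with made-up names; _mm_prefetch is the x86 intrinsic, and the lookahead distance of 8 is arbitrary:
#include <immintrin.h>
// Sketch only: 'mass_index' and 'masses' are illustrative, not Anukari's names.
void ProcessSprings(int count, const int* mass_index, const float* masses) {
  constexpr int kLookahead = 8;  // arbitrary lookahead distance
  for (int i = 0; i < count; ++i) {
    if (i + kLookahead < count) {
      // Hint that we'll soon need the data for a later iteration.
      _mm_prefetch(reinterpret_cast<const char*>(&masses[mass_index[i + kLookahead]]),
                   _MM_HINT_T0);
    }
    // ... do the actual work for iteration i using masses[mass_index[i]] ...
  }
}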
I experimented with prefetching quite a bit, and never found success with it. I’m not completely sure why, but my best guess is that it adds additional instructions, and my loops are already pipelined heavily enough that the cost of executing a few more instructions was not paid back by the prefetching.
Stay tuned
Despite the fact that I’ve probably already written way more than anybody ever wanted to read on this topic, there will be one last installment in this series of posts about Anukari on the CPU, where I will reflect on some of the lessons I learned as part of this process, as well as some other general thoughts on what went well and not so well.
Anukari on the CPU (part 1: GPU issues)
Captain's Log: Stardate 79314.6
It’s been a while since I posted a devlog entry, but it’s not for a lack of things to post about. In fact quite the opposite: I’ve been so heads-down in a complete rewrite of the physics simulation code that I haven’t had time to write a post. With the 0.9.21 release out the door and looking good, I’m excited to finally write about what I’ve been doing.
TL;DR: Anukari now runs on the CPU instead of the GPU, using hand-coded platform-specific SIMD instructions, performing much better and solving all of the compatibility headaches that come with doing audio on the GPU.
This may come as a surprise to readers who have followed the Anukari GPU saga. Trust me, it comes as a bigger surprise to me. In this multi-part devlog entry I will explain why Anukari used the GPU in the first place, why I did not think it was possible to run it on the CPU, why I was wrong about that, and how the new CPU implementation works.
The GPU origin story
When I began working on Anukari, I was not at all sure that the kind of physics simulation I wanted to build could run in real time. I committed to using a “fail fast” design strategy, where I would start with the problems I considered most difficult/uncertain, and try to prove I could solve them before wasting effort on anything else.
In the first month of the project, my top two goals were to prove that (1) I could make the simulation run in real time, and (2) the simulation could produce interesting sounds. If either of these points proved false, I would have moved on to my next project idea.
The first thing I did was hack together the simplest 1D spring/mass simulation possible, in C++ code running on the CPU. Within a day it was producing interesting sounds in real time with a small number of masses. I then expanded this into the hackiest possible 3D simulation, and again, found the sounds interesting enough that I felt it was worth continuing.
At this time, I was able to simulate something like 10 point masses in 3D in real time. The sounds were interesting, but I had done some offline rendering experiments at one point with thousands of point masses, and really wanted to achieve that. I set the goal of simulating 10,000 point masses in real time, which in 3D means that I’d want something like 50,000 springs.
At this point my simulation code was not at all optimized. It was just the simplest thing I could hack together, with no thought towards speed. But I did some napkin math, and even assuming that I could achieve a 100x speedup by optimizing this code, and maybe parallelizing it across threads, it still was not realistic to simulate 50,000 springs. There was just no way.
Now, at this point I already had a background in GPU programming. I’ve written many small 3D game engines, and also at one point I wrote a GPU-based SHA-1 cracking utility as part of a hackathon to prove that my company’s secret salt was not complex enough. So naturally it occurred to me that the GPU might have the FPU bandwidth I needed.
And indeed, doing some napkin math on the FLOPS provided by a reasonable GPU, the numbers worked out. It seemed possible.
Given that I had done GPU work before, it did not take long for me to hack together a simple implementation of the physics simulation using OpenCL. Immediately I was able to simulate several hundred physics objects in real time, and there seemed to be a path to simulating much larger systems.
At this point, I went ahead and invested all my effort in the GPU simulation and never looked back. Until a couple months ago.
A surprising discovery
Over time, I ported the OpenCL simulator to CUDA and then Metal. I found many optimizations along the way, and came to know the various GPU APIs quite intimately. I learned a huge amount about how the hardware actually works, and took the best advantage of it that I could to get the simulation running extremely fast.
The physics simulation code was 95% shared between the GPU APIs. Their common denominator is C, and so the GPU code is in straight C (not C++). I used some awful macro sorcery to get the same C code compiling as OpenCL C, CUDA C++, and Metal C++.
Because the simulation was just C, I always had the idea that eventually I could write a naive CPU backend that shared the same C code, so that users without a good GPU could at least run tiny instruments in a pinch. But the GPU C code was highly-optimized for GPU-style parallelism, and depended a lot on threadgroup memory for fast inter-thread communication, etc, so I figured the naive CPU backend would perform really poorly.
A while back a user asked me why Anukari was glitching when he ran it at the same time as his VJ visualization software. I looked into it, and the VJ software was (reasonably) assuming that it owned the GPU, and Anukari’s GPU physics code was getting starved. I explained this to the user, and he understood, but I was not very satisfied with this situation.
I thought, hey, maybe it’s time to slam out the naive CPU backend? For a user like this it might be better to be able to use Anukari for very simple instruments, even if bigger instruments are not possible. So I spent a week or so hacking together the stupidest CPU backend possible, calling directly into the GPU C code.
I was extraordinarily surprised with how fast the CPU backend ran. It wasn’t nearly as fast as the GPU, but it was only 5x slower, whereas I was expecting it to be 100x slower. At this point I began to realize that something in my assumptions was really badly broken, and pivoted all my effort to understanding what was going on.
I profiled the CPU backend and immediately realized that there was a ton of room for optimization. The hottest loops were not using CPU SIMD instructions at all, which seemed like it might be a huge opportunity. I began to wonder if a CPU implementation might actually be able to perform as well or possibly even better than the GPU implementation.
Uneven GPU sharding
At this point it’s worth asking how the CPU implementation could possibly be competitive with the GPU version. In terms of raw 32-bit FLOPS, the GPU is the clear winner, and we’ve established that Anukari is FPU-bound. So what gives?
Fundamentally the problem is that Anukari’s physics simulation cannot utilize the full power of the GPU, due to the synchronization required for the physics data dependencies.
Anukari’s simulation is a simple numeric integration in which the time step is the duration of one audio sample. At each step, the net force (and other physics parameters) is computed for each object, and is applied to determine the acceleration, velocity, and next position of the object.
Since physics objects can affect one another, via springs or other connections, it is not possible to compute step N + 1 for an object until step N has been fully computed for all connected objects. For example, to calculate spring forces for a given mass, we need to know the current positions of any masses that are connected to it.
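Stripped of every detail, the per-object update is something like the following scalar sketch (a hypothetical Float3 type and a semi-implicit Euler step, not Anukari’s actual code):
struct Float3 { float x, y, z; };  // minimal vector type for this sketch
// One integration step: a = F / m, v += a * dt, x += v * dt, where dt is the
// duration of one audio sample (e.g. 1.0f / 48000.0f).
void StepMass(const Float3& net_force, float inv_mass, float dt,
              Float3* velocity, Float3* position) {
  velocity->x += net_force.x * inv_mass * dt;
  velocity->y += net_force.y * inv_mass * dt;
  velocity->z += net_force.z * inv_mass * dt;
  position->x += velocity->x * dt;
  position->y += velocity->y * dt;
  position->z += velocity->z * dt;
}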
The way Anukari’s GPU code handles this is as follows:
1. Each physics object is mapped 1:1 to a thread on the GPU.
2. Shared state (such as position) is stored in threadgroup memory, which is equivalent to L1 cache.
3. Each thread computes step N for its object and then waits for a threadgroup barrier before proceeding to step N + 1, guaranteeing that all other threads have been stepped forward.
This design takes advantage of some incredibly useful GPU features. The threadgroup memory is basically a manually-mapped L1 cache, and provides incredibly fast access to shared state among all threads in the threadgroup. And the threadgroup barrier is an extremely efficient way to synchronize each integration step.
In case you’re wondering what happens if the threadgroup barrier is omitted, and threads are allowed to progress more than 1 physics step beyond their peers, here’s a video of the simulation with the barrier removed:
Having physics objects 1:1 with threads does impose a limitation on the size of the instrument that can be simulated, because GPU threadgroups have a hardware-limited number of threads (commonly 1,024 threads on recent hardware). Anukari runs up to 16 independent voices (i.e. multiple MIDI notes) in separate threadgroups because they don’t have physics interactions. This allows a total of 16,384 objects to be simulated in real time, but still only in physics-connected groups of 1,024 objects.
Given that these threadgroup features are so powerful, what’s the problem? The answer lies in uneven sharding.
In distributed systems design, it is often the case that you need to divide N work items among M machines to compute in parallel. For “embarrassingly parallel” problems, you can choose an owner machine for a work item by hashing the work item’s ID and taking the modulus M. Any good hash function will have a uniform distribution, so this uniformly assigns work items to machines. This diagram shows the runtime of several machines which have been assigned roughly equal amounts of work:
Or does it? If the work item IDs are unique, this works great. But what if the work item IDs contain duplicates? The duplicates will all have the same hash, and thus will be handled by the same machine. In web engineering, a classic example is processing web pages. Imagine that you use the domain as the ID field for hashing. Some domains may have millions of web pages, so if you hash on the domain, work will no longer be assigned evenly to machines. It’s easily possible that the machine that receives the largest domain ends up with 10x as much work as any other machine. Now the overall wallclock runtime of the job is huge, because you’re stuck waiting for one machine to do a bunch of work alone, even while all the other machines are finished.
This diagram shows the runtime of several machines, where one has been assigned much more work than the others:
Coming back to Anukari, since physics objects map 1:1 with threads, we do not have this particular kind of uneven sharding. But there’s a different kind, which is that the computation required to process different kinds of physics objects is not uniform.
As mentioned above, a threadgroup barrier is used to ensure that threads step forward in lockstep. This means that the wallclock duration of computing a thread is the time it takes to compute the integration for the slowest physics object.
In Anukari, the computation required for an “Anchor” is just collecting modulation signals from any attached modulators that affect its position, and interpolating a new position.
But a “Mic” object is far more involved. It has to collect modulation signals as well as information about connected bodies. Its rotation and gain can be modulated, which require slow sin/cos/log instructions. It has several other features to calculate, including a compressor, and even a sqrt for the isotropy feature.
So computationally a Mic can easily be 10x the work of an Anchor. Thus at each physics step, the Anchor threads spend 90% of their time idle, blocking on the threadgroup barrier while the Mic threads finish the step.
The diagram below shows the timeline for several threads, where one thread has much more work than the others. At each time step, most of the threads finish early and wait idle until the slowest thread is ready to begin the next cycle:
The problem is compounded by objects that need to collect signals from many other objects. For example, if a Mic is connected to 100 Body objects, it has to look up the position of all those 100 Bodies to accumulate the final audio sample. This Mic will therefore delay each step even further.
This is why Anukari can’t take advantage of the full FPU bandwidth of the GPU. It is limited to using a single thread per physics object, and due to the unequal sharding, most of those threads spend a lot of their time idle. Idle threads equate to unused FPU cycles.
Uneven sharding solutions?
There are ways to partially address Anukari’s uneven sharding on the GPU.
First, the computation for a physics object could be pipelined. So for a Mic, we could split the computation into three steps that run in parallel: 1. collect signals from connected bodies, 2. apply transformations, and 3. apply compression. The computations for a single Mic would run in 3 threads, each reading the output from the previous thread at each physics step. This would introduce a small amount of delay (2 audio samples) in the output signal, but this is not a big problem. It would reduce the unequal sharding problem to the duration of the longest of the 3 pipeline steps.
However there are practical problems here. By having each Mic map to 3 threads, the total number of physics objects that Anukari can simulate is reduced (since the total number of threads is limited). Obviously it adds significant complexity to the implementation. And still the collection step may be very long if many objects are connected to the Mic.
The issue with a heavily-connected object is difficult. The first idea I had was to rewrite the simulation as two alternating passes: 1. a pass to collect signals from connections, and 2. the per-object physics pass. The idea is that the first pass would spread connections uniformly over threads, so no thread would hotspot.
However this does not work, because the signals that are collected from connected objects need to be accumulated, and the accumulation requires synchronization. For example, for an object connected to 100 other objects with springs, the net force must be accumulated. So if springs were assigned to separate threads, those threads would have to synchronize read-update-writes to a shared force accumulator. This would require atomics, which is not really possible for float3 vectors (and even if you pull it off, this is serializing part of the work anyway).
There are other possibilities for Anukari to make better use of the GPU’s resources, but they get increasingly complex and difficult.
How the CPU competes with the GPU
Ultimately we’ve seen that the issue with Anukari’s GPU solution is that most of its threads sit idle most of the time waiting for hotspot threads, wasting precious FPU bandwidth.
This is why the CPU can compete with the GPU: despite the lower total FPU bandwidth of the CPU, Anukari’s simulation algorithm can much more easily utilize that bandwidth fully.
On the CPU, each independent MIDI voice is assigned to a separate physical thread, since separate voices do not have any data dependencies between them. Each thread then independently simulates N audio samples in a loop like, “for each audio sample, for each object, step forward.” No synchronization is required because the CPU runs one integration step for all objects before moving on to the next step, ensuring lock-step simply by the lack of concurrency.
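In pseudocode-ish C++, each voice’s thread runs something like this (a sketch; Voice and StepObject are hypothetical placeholders for Anukari’s real types and per-object update):
struct Voice { int num_objects; /* ... per-voice simulation state ... */ };
void StepObject(Voice& voice, int obj);  // placeholder for the per-object integration

void SimulateVoice(Voice& voice, int num_samples) {
  for (int sample = 0; sample < num_samples; ++sample) {
    for (int obj = 0; obj < voice.num_objects; ++obj) {
      StepObject(voice, obj);
    }
    // Every object has now completed this step, so the next step can safely
    // read neighbors' updated state without any synchronization.
  }
}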
Thus on the CPU there is never any time when a thread goes idle, and the FPU can be fully utilized. Of course this is still subject to whether we can actually read data from memory fast enough to saturate the FPU, which I’ll talk about in part 2 of this post.
Later in part 3 of this post, I will reflect on whether I could have avoided putting so much effort into the GPU implementation before finding the better CPU solution.
Finally Anukari has macros, and a preset API
Captain's Log: Stardate 79052
The problem
Anukari has long had a modulation system, with LFOs, host automation controllers, MIDI, etc. But adding modulation to a preset has always been a kind of labor-intensive process. And one big gaping hole in the UX was the lack of a way to bind a knob inside Anukari itself to allow modulation to be controlled via the mouse.
The lack of mouse control was a hassle, but the problems were a bit deeper than that. Because of this issue, interfacing with the DAW was always not quite the way users expected. For example, in FL Studio you can choose an automation via the learn feature by "wiggling" a knob inside a VST plugin. FL Studio watches and sees which knob was wiggled and binds it. But of course with no mouse-controlled knobs inside Anukari, this was not possible.
Furthermore, while it was possible to map host parameters to automations inside Anukari, they could only be controlled via the DAW, which is really inconvenient, and often is a really weird workflow. Users expect to be able to hit "record" in the DAW and then go play the VST, knobs and all, and have everything recorded.
Macros
The solution was to add some knobs inside Anukari that can be mapped to control the modulation system. Those are shown here in the lower right-hand corner:
(The icons and graphics are still provisional while I wait for my designer to improve them.)
There are eight knobs in total (in the screenshot only four are showing; the other four are collapsed). Each knob can be renamed, and corresponds to a mapping that will automatically appear in the DAW. And each knob is connected to any number of 3D Macro objects, which it will control.
This is already really handy, but the killer feature is the little grabby-hand button icon next to each macro knob. When the user drags this, parameters in the right-hand editor panel that can be modulated will automatically be highlighted, and when the user drops onto one of them, a 3D Macro object will be created which is automatically linked to the given parameter on all selected entities. Here's an example:
This is a pretty transformative change. It is dramatically easier to create automations inside Anukari, play with them, and edit them. And then they can be performed with the knob movements recorded in the DAW.
Side benefits
The new macro system addressed a bunch of feedback I repeatedly got from users, and solved a bunch of problems. But in addition to that, there were a number of extra advantages to the new system that came very cheaply.
Having the drag-and-drop system for modulation naturally made it easy to do the same thing for other modulator types. So now, if a user drags in an LFO from the entity palette on the bottom of the screen, they can drag it straight to a highlighted parameter to create an LFO connected to that parameter on the selected entities. This can be done with any modulator and is hugely convenient.
Another big benefit is that now all the built-in automations for all the factory presets are discoverable. Previously with no knobs in the main UI, there was no easy way to see what parameters had been configured for modulation as you cycled through presets. Now you can see them all, and trivially play with them via the mouse. Even better, in the standalone app the 8 macro knobs map to MIDI continuous control parameters 1-8, so on most MIDI controllers you can just turn knobs and things will happen, with visual feedback.
Finally this opens the door for even more interesting drag-and-drop use cases. The first one I have in mind is for creating exciter objects, like Mallets. The idea is that the user will be able to select a bunch of Bodies (masses), and then drag the Mallet object from the palette onto the right-panel (which will be highlighted) and it will automatically create the Mallet and connect it to all the selected Bodies. This will be much more convenient than the workflow today.
Anukari Builder (preset API)
In the Anukari Discord server, the user 312ears is notable for providing extremely helpful feedback about Anukari, obviously borne out of using it in depth. When I first released the Beta, they were one of the people who suggested that it would be cool if it were possible to create presets programmatically, for example via a Python API.
I really wanted to help with this, but for the foreseeable future my time has to be focused on making the plugin itself work better. So I offered to release the Google Protocol Buffer definitions for the preset file format and provide a bit of support on using them, but could not commit to any kind of nice APIs.
Anyway, 312ears took the Protocol Buffer definitions and built an entire Python API for building Anukari presets. Their API can be found here: github.com/312ears/anukaribuilder.
This is an absolutely incredible contribution to the project and community. It allows users who can write Python to create presets that would otherwise be far too tedious to make. On some hardware Anukari supports up to 1,000 physics objects, and arranging them in a complex geometric pattern is difficult with the UI. But with Python all kinds of things become possible. For example, 312ears has shown demos in the Discord server of presets that can shift between two shapes, say a sphere and a pyramid, by turning a MIDI knob. Here's a quick example: