devlog

Way more detail than you ever wanted to know about the development of the Anukari 3D Physics Synthesizer

Preparing for the open (paid) Beta

Captain's Log: Stardate 78674.7

It's been a while since I wrote a devlog update, but I promise it's not for a lack of stuff to talk about. Rather, it's the opposite: I've been so busy with Anukari work I haven't had time to write up a devlog entry!

Anukari is feeling extremely stable at this point, and I'm up to about 85 factory presets. I did a pass through all the existing presets and made sure that all of them have about 2 DAW host automation parameters, which made them dramatically more fun to play with. And, in an exciting development, I've contracted with someone awesome to make more presets, along with tutorial videos and a few other things.

With someone else working on presets and tutorials, I've been free the last week or two to work on preparing for the open Beta. This will be the first public (non-invitation) release of Anukari, which is extremely exciting. While there is still a little work left in the Anukari product itself, mostly revolving around implementing the free trial mode, adding some first-time startup screens, etc, what I've been working on recently is getting the web site ready.

The biggest thing was figuring out how I want to do payments. I toyed with a bunch of possibilities, but in the end I settled on using Shopify. The biggest part of my rationale is that, at least in the US, Shopify has become so ubiquitous that I think when I direct my customers to the Shopify checkout flow, they'll be very comfortable with it, and will probably trust Shopify with their credit card number much more than if I implemented CC# collection myself.

But also, using Shopify saves me from a bunch of other tedious work, like implementing admin dashboards for looking at order history, issuing refunds, and so on. The basic integration with Shopify really only took a couple of days.

Overall I'm happy enough with Shopify. But of course I have complaints. My biggest gripe has to do with Shopify's developer ecosystem. It seems to me that a large part of their success is their "partner program," which is a Shopify-supported system for software developers to offer their services to businesses that want Shopify storefronts. So far so good -- this makes a ton of sense, as most local/small businesses that use Shopify are not going to have software engineers to do this stuff.

However, my gripe is specifically with the app ecosystem. Shopify developers can create apps that add helpful features to a store, which shop owners can then install. So for example, there's an app to add a warning to a cart when too many of one item are added to the cart. Many of the apps are extremely simple stuff like that. Which sounds fine, right?

It is not fine. The problem is that since developers sell these Shopify apps on Shopify's app store, Shopify gets a cut. And all the apps I've seen have a recurring subscription payment. The simple 10-lines-of-code app I described above is $6/month. So, because Shopify gets a cut of $6/month, they are hugely disincentivized to add these kinds of simple features to the core platform. They are also disincentivized to make the core platform easier to use in general, because then maybe users would solve their own problems without paying $6/month to solve them.

Obviously for me this is not such a big problem, since I'm a software engineer and can mostly do these things myself. Though, I'd really prefer not to have to learn any more about Shopify's reprehensible template system than I have already. But for local/small business owners, Shopify's app store is just going to nickel-and-dime them to death. I can easily see a specialty business having to pay subscriptions for a handful of apps (many of which cost a lot more than $6/month) just to do the basic things they need.

Anyway, that's enough ranting about Shopify. I have things pretty much working, so I shouldn't complain too much.

The AAX plugin is working

Captain's Log: Stardate 78612.4

I was pleasantly surprised to find that getting Anukari working as an AAX plugin was pretty straightforward. I mean, I would have preferred if Pro Tools would just support VST3 instead of this ridiculous song and dance of having their own custom plugin format. But it could have been a lot more painful, so I shouldn't complain too much.

There were some moderately annoying logistics around getting an iLok key, and getting the right licenses/certificates from PACE to be able to sign the AAX plugin. It all felt faintly silly, because I'm already signing all the DLLs and .exe files with OS certificates on both Windows and macOS, and one might imagine that PACE would just work with the OS certificates. But I guess PACE wants more control over things than that.

Overall my biggest takeaway from my interaction with PACE/iLok is that things are what you'd expect from an anti-piracy company that sells software: the anti-piracy seems to be the main feature and the fact that you actually receive software that does something music-related is just an accidental byproduct. I guess I can understand why pro shops like the iLok, since it allows for a uniform licensing solution with offline access, but from a consumer perspective it is massively overcomplicated.

Ranting aside, it does seem that JUCE really does deliver on the promise of seamlessly producing plugins in each of its supported formats. I don't think that I ran into any case where a problem showed up in the AU or AAX version of Anukari that was JUCE's fault, which is pretty impressive.

One nice thing about the AAX port is that it helped me find a couple of lingering bugs that also affected VST3 and AU, but weren't easy to reproduce in any of the 15 DAWs I've been testing against. Pro Tools does things in a way that JUCE's APIs do document correctly, but that I had just never really accounted for. So the fixes for Pro Tools also fixed some issues that I've been having trouble tracking down in other DAWs.

With the AAX support looking good, now I really need to focus on creating more presets. I think that a good body of presets is currently the biggest blocker for starting to talk to distributors, which is pretty exciting!

Getting more and more stable

Captain's Log: Stardate 78592.1

The buffer-clearing saga

Adding the new AnukariEffect plugin has ended up precipitating a lot of improvements to Anukari, because it pushed me into testing what happens when multiple instances of the plugin are running at the same time. Most of my testing is done in the standalone Anukari application. It loads extremely quickly, so it's nice for quickly iterating on a new UX change, etc. But in reality, it's likely that users will mostly use Anukari as a plugin, so obviously I need to give that configuration ample attention.

The last big issue I ran into with the plugin was that in GarageBand, loading a song that had something like 6 instances of Anukari and AnukariEffect, sometimes one of the instances would mysteriously fail. The GPU code would initialize just fine, but the GPU call to process the first audio block would fail with the very helpful Metal API error: "Internal Error (0000000e:Internal Error), unknown reason".

After some research, it turned out that to get a more detailed error from the Metal API, you have to explicitly enable it with MTLCommandBufferDescriptor::errorOptions, and then dig it out of the NSError.userInfo map in an obscure and esoteric manner. So I had my intern (ChatGPT) figure out how to do that and finally I got a "more detailed" error message from the Metal API: IOGPUCommandQueueErrorDomain error 14.

If you've followed my devlog for a while, it should come as no surprise that I am a bit cynical about Apple's developer documentation. So I was completely unsurprised to find that this error is not documented anywhere in Apple's official documents. Apple just doesn't do that sort of thing.

Anyway, I found various mentions of similar errors, with speculation that they were caused by invalid memory accesses, or by kernels that ran too long. I used the Metal API validation tools to check for any weird memory access and they didn't find anything weird. I figured they wouldn't, since I have some pretty abusive fuzz tests that I've run with Metal API validation enabled, and invalid memory access almost certainly would have shown up before.

So I went with the working hypothesis that the kernel was running too long and hitting some kind of GPU watchdog timer. But this was a bit confusing, since the Anukari physics simulation kernel is, for obvious reasons, designed to be extremely fast. With some careful observation and manual bisection of various code features, I realized that it was definitely not the physics kernel, but rather it was the kernel that is used to clear the GPU-internal audio sample buffer.

Some background: Anukari supports audio delay lines, and so it needs to be able to store 1 second of audio history for each Microphone that might be tapped by a delay line. To avoid allocations during real-time audio synthesis, memory is allocated up-front for the maximum number of Microphones, which is 50. But also note that there can be 50 microphones per voice instance, and there can be 16 voice instances. Long story short, the per-microphone, per-instance, per-channel buffer for 1 second of audio is about 300 MB, which is kind of huge.
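For a rough sanity check on that figure, here is a back-of-the-envelope sketch in C++. Only the 50 microphones, 16 voice instances, and 1 second of history come from the text above. The 48 kHz sample rate, 32-bit float samples, and 2 channels are assumptions picked for illustration, and all of the names are hypothetical.

```cpp
#include <cstdint>

// From the post: 50 microphones, 16 voice instances, 1 second of history.
constexpr std::uint64_t kMaxMicrophones = 50;
constexpr std::uint64_t kMaxVoices = 16;
constexpr std::uint64_t kHistorySeconds = 1;

// Assumptions for illustration only (not stated in the post).
constexpr std::uint64_t kChannels = 2;          // stereo
constexpr std::uint64_t kSampleRateHz = 48000;  // 48 kHz
constexpr std::uint64_t kBytesPerSample = 4;    // 32-bit float

// Bytes of audio history preallocated by one plugin instance.
constexpr std::uint64_t historyBufferBytes() {
    return kMaxMicrophones * kMaxVoices * kChannels *
           kSampleRateHz * kHistorySeconds * kBytesPerSample;
}

// Bytes written if `instances` plugin instances each clear their whole buffer.
constexpr std::uint64_t totalClearBytes(std::uint64_t instances) {
    return instances * historyBufferBytes();
}

// 50 * 16 * 2 * 48000 * 1 * 4 = 307,200,000 bytes, i.e. roughly 300 MB.
static_assert(historyBufferBytes() == 307'200'000ULL, "about 300 MB per instance");
```

Under these assumptions a single instance holds 307.2 MB of history, and `totalClearBytes(6)` comes to about 1.8 GB for six instances clearing at once.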

It's obvious that clearing such a buffer needs to be done locally on the GPU, since transferring a bunch of zeros from the CPU to the GPU would be stupid and slow. So Anukari had a kernel that would clear the buffer at startup, or at other times when it was considered "dirty" due to various possible events (or if the user requested a physics reset).

Now imagine 6 instances of Anukari all being initialized in parallel, and each instance is trying to clear 300 MB of RAM -- that's nearly 2 GB of memory writes all competing for bandwidth. And sometimes one of those kernels would get delayed or slowed enough to time out. The problem only gets worse with more instances.

Initially I considered a bunch of ideas for how to clear this memory in a more targeted way. We might clear only the memory for microphones that are actually in use. But then we have to track which microphones are live. And also, the way the memory is strided, it's not all that clear that this would help, because we'd still be touching a huge swath of memory.

I came up with a number of other schemes of increasing complexity, which was unsatisfying because complexity is basically my #1 enemy at the moment. Almost all the bugs I'm wrangling at this point have to do with things being so complex that there were corner-cases that I didn't handle.

At this point you might be asking yourself: why does all this memory need to be cleared, anyway? That's a good question, which I should have asked earlier. The simple answer is that if a new delay line is created, we want to make sure that the audio samples it reads are silent in the case that they haven't been written yet by their associated microphone. For example, at startup.

But then that raises the question: couldn't we just avoid reading those audio samples somehow? For example, by storing information about the oldest sample number for which the data in a given sample stream is valid, and consulting that low-watermark before reading the samples.

The answer is yes, we could do that instead. And in a massive face-palm moment, I realized that I had already implemented this timestamp for microphones. So in other words, the memory clearing was completely unnecessary, because the GPU code was already keeping track of the oldest valid audio sample for each stream. I think what happened is that I wrote the buffer-clearing code before the low-watermark code, and forgot to remove the buffer-clearing code. And then forgot that I wrote the low-watermark code.
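To make the low-watermark idea concrete, here is a minimal CPU-side sketch in C++. It is not the actual Anukari GPU code: the `SampleStream` struct, its fields, and the three functions are hypothetical names invented for illustration. The point is only that a reader can treat anything older than the watermark as silence, so the underlying memory never has to be cleared.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-stream state: a ring buffer that is never cleared,
// plus the range of absolute sample indices known to hold real data.
struct SampleStream {
    std::vector<float> ring;             // circular audio history (may hold stale data)
    std::int64_t oldestValidSample = 0;  // low watermark (absolute sample index)
    std::int64_t nextWriteSample = 0;    // absolute index of the next sample to write
};

// Called when a stream is (re)created at absolute time `now` (assumed >= 0).
// Everything older than `now` becomes silence without touching the buffer.
inline void resetStream(SampleStream& s, std::int64_t now) {
    s.oldestValidSample = now;
    s.nextWriteSample = now;
}

// Appends one sample at the current write position.
inline void writeSample(SampleStream& s, float value) {
    const auto size = static_cast<std::int64_t>(s.ring.size());
    s.ring[static_cast<std::size_t>(s.nextWriteSample % size)] = value;
    ++s.nextWriteSample;
}

// Returns the sample at absolute index `sampleIndex`, or silence if it is
// older than the low watermark, already overwritten, or not yet written.
inline float readSample(const SampleStream& s, std::int64_t sampleIndex) {
    const auto size = static_cast<std::int64_t>(s.ring.size());
    if (sampleIndex < 0 || sampleIndex < s.oldestValidSample) return 0.0f;
    if (sampleIndex < s.nextWriteSample - size) return 0.0f;  // overwritten
    if (sampleIndex >= s.nextWriteSample) return 0.0f;        // not written yet
    return s.ring[static_cast<std::size_t>(sampleIndex % size)];
}
```

In this sketch, deleting and re-adding a microphone would just call `resetStream` with the current sample number. Stale data stays in the ring, but `readSample` never returns it.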

Well, that's not quite the whole story. In addition to the 50 microphone streams, there are 2 streams to represent the stereo external audio input, which can also be tapped by delay lines (to inject audio into the system as an effect processor). This data did not have a low-watermark, and thus the clearing was important.

However for external audio, a low-watermark is much simpler: it's just sample number 0. This is because external audio is copied into the GPU buffer on every block, and so it never has gaps. The Microphone streams can have gaps, because a Microphone can be deleted and re-added, etc. But for external audio, the GPU code just needs to check that it's not reading anything prior to sample 0, and after that it can always assume the data is valid.
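The external-audio case can be sketched even more simply. Again, this is a hypothetical C++ illustration rather than the real GPU code: `readExternalAudio` and its parameters are made-up names, and it assumes the delay tap is between 1 sample and the full length of the ring buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical read of an external-audio history ring.
//   samplesWritten: how many samples have been copied in so far (absolute count)
//   delaySamples:   how far back the delay line taps, in [1, ring.size()]
// Because external audio is written on every block, the only invalid region
// is "before sample 0", so a single bounds check replaces buffer clearing.
inline float readExternalAudio(const std::vector<float>& ring,
                               std::int64_t samplesWritten,
                               std::int64_t delaySamples) {
    const auto size = static_cast<std::int64_t>(ring.size());
    assert(delaySamples >= 1 && delaySamples <= size);
    const std::int64_t sampleIndex = samplesWritten - delaySamples;
    if (sampleIndex < 0) return 0.0f;  // before sample 0: treat as silence
    return ring[static_cast<std::size_t>(sampleIndex % size)];
}
```

With a 4-sample ring holding two written samples, a tap of 1 or 2 samples returns real data, while a tap of 3 or 4 samples lands before sample 0 and yields silence regardless of whatever garbage is in the uncleared buffer.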

Thus ultimately the fix here was to just add 2 lines of GPU code to check the buffer access for external audio streams, and then to delete a couple hundred lines of CPU/GPU code responsible for clearing the internal buffer, marking it as dirty, etc. This resulted in a noticeable speedup for loading Anukari and completely solved the issue of unreliable initialization in the presence of multiple instances.

Pre-alpha release 0.0.13

With the last reliability bug (that I know of) solved, I was finally able to cut a new pre-alpha release this Friday. I'm super stoked about this release. It has a huge number of crash fixes, bug fixes, and usability enhancements. It also turned out to be the right time to add a few physics features that I felt were necessary before the full release. The details of what's in 0.0.13 are in the release notes and in older devlog entries, so I won't go into them here, but this release is looking pretty dang good.

The next two big things on my radar are AAX support and more factory presets. On the side I've been working to get the AAX certificates, etc., needed to release an AAX plugin, and I think that it should be pretty straightforward to get this working (famous last words). And for factory presets, I have about 50 right now but would like to release with a couple hundred. This is especially important now that I've added AnukariEffect, since only a couple of the current presets are audio effects -- most of them are instruments. So I'm kind of starting from scratch there. I think it's pretty vital to have a really great library of factory presets for both instruments and effects, and also, working on them is a great way to find issues with the plugin.


© 2025 Anukari LLC, All Rights Reserved
The Audio Units logo and the Audio Units symbol are trademarks of Apple Computer, Inc.
VST is a trademark of Steinberg Media Technologies GmbH, registered in Europe and other countries.