devlog

Way more detail than you ever wanted to know about the development of the Anukari 3D Physics Synthesizer [see archive]

Getting into the usability weeds

Captain's Log: Stardate 78505.1

It's been a while since I posted! This is partly due to the holidays, but also partly due to my work recently being pretty scattered and piecemeal, so I haven't felt like there was anything super-cohesive to write about.

The main thing that I've been doing is working on Anukari's usability. Most of the big UX things I want are implemented, so now I am focusing on the little details.

My wife Meg was kind enough to do a UX study, which surfaced many small issues. I gave her only some written instructions on tasks to complete, and she was able to do nearly all of them without any help, which was pretty amazing to see, since even a few months ago I think that would have been unrealistic. I watched her carefully, and she also took notes, and this one 30-minute session resulted in a couple dozen UX improvement ideas.

With most of those improvements done, I decided that the next thing I should do to find UX problems is to start building the factory presets that will ship with Anukari. So far I've created about 50 presets, and it has definitely been a useful process. The high-order bit is that things are actually working pretty well. That's really nice to see. But there were a bunch of small things that, while they didn't prevent me from getting anything done, added enough friction to be mildly annoying. And when all those little papercuts add up, they are still pretty problematic.

I won't list all the tiny UX issues here (the release notes will more-or-less cover them), but I'll talk about one of the most long-term persistent issues: physics explosions.

Physics Explosions

In any discrete physics simulation, there are going to be parameters where extreme values can cause problems. At a fundamental level, you can run into things like floating-point error, and that is a real thing for Anukari, but it's mostly been easy to avoid by simply providing reasonable bounds for all the parameters (which requires thinking about how parameters interact, what kind of floating-point operations are done on them, etc.).

The bigger problem for Anukari has always been that the fixed time step (one step per audio sample) is sometimes just too big for a given set of parameters. When that happens, the simulation error grows with each step, which very quickly gets out of control and causes a complete explosion. The introduction last year of time dilation for voice-instanced polyphony made this much more apparent, because higher notes require exponentially larger time steps: each octave doubles the amount of time the simulation needs to advance for each discrete step.
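To make the octave relationship concrete, here is a minimal sketch of how a per-voice time step could scale with pitch. The function name and the equal-temperament semitone mapping are my assumptions for illustration; the post only states that each octave doubles the step.

```python
def dilated_time_step(sample_rate_hz: float, semitones_above_base: float) -> float:
    """Hypothetical per-voice time step: one audio sample, stretched by pitch.

    Assumes 12 semitones per octave, so +12 semitones doubles the step.
    """
    base_dt = 1.0 / sample_rate_hz
    return base_dt * 2.0 ** (semitones_above_base / 12.0)


# One octave up doubles the step; two octaves up quadruples it.
dt_base = dilated_time_step(48_000.0, 0)
dt_octave = dilated_time_step(48_000.0, 12)
dt_two_octaves = dilated_time_step(48_000.0, 24)
```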

I had a few theories about where things were going wrong, and investigated them pretty deeply this last week. It turns out that there's nothing discontinuous happening; i.e. no float is going to infinity or NaN. There's no overflow or underflow. Nothing that subtle is required for the explosions. It really is just that the time steps are getting too big.

For Anukari this is not really a solvable problem. The simulation uses a first-order integrator, and that's already extremely computationally expensive. A while back, I experimented with implementing it as an RK4 integrator, which is a fourth-order method and thus substantially more accurate. And it was more accurate, but the cost of computing all the extra derivatives outweighed the accuracy benefits. It was just too slow to consider. The most reasonable option in Anukari is simply to use smaller time steps, which the user can do by setting the sample rate to something higher. This does what you might expect -- doubling the sample rate gives you an extra octave of usable simulation without explosions.
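The sample-rate/octave trade-off can be seen in a toy model. This sketch is not Anukari's simulator: it assumes a single undamped mass on a spring, integrated with semi-implicit Euler (one common first-order scheme), and a made-up 1 kHz natural frequency. For this scheme the oscillator stays bounded only while the angular frequency times the time step is below 2.

```python
import math


def peak_displacement(omega: float, dt: float, steps: int = 2000) -> float:
    """Integrate x'' = -omega^2 * x with semi-implicit Euler; return max |x|."""
    x, v = 1.0, 0.0
    peak = abs(x)
    for _ in range(steps):
        v -= omega * omega * x * dt  # velocity from the current position
        x += v * dt                  # position from the updated velocity
        peak = max(peak, abs(x))
        if peak > 1e6:               # treat this as an explosion and stop early
            break
    return peak


def highest_stable_octave(sample_rate_hz: float,
                          base_freq_hz: float = 1000.0,
                          max_octaves: int = 12) -> int:
    """Highest octave shift (time step doubled per octave) that stays bounded."""
    omega = 2.0 * math.pi * base_freq_hz
    best = -1
    for octave in range(max_octaves + 1):
        dt = (2.0 ** octave) / sample_rate_hz
        if peak_displacement(omega, dt) >= 1e6:
            break
        best = octave
    return best


octaves_48k = highest_stable_octave(48_000.0)
octaves_96k = highest_stable_octave(96_000.0)
```

Under these assumptions, the 96 kHz run stays stable for exactly one more octave than the 48 kHz run, matching the behavior described above.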

Anyway, Anukari is plenty usable within this limitation. But there is a big usability issue where you are having a great time playing an instrument, and then you happen to hit a note that's just a little too high, and the physics explode. In voice-instanced mode, only one of the voice instances will explode, but it will stay broken until you do something about it. So the moment you hit a "bad" note all the fun stops. I want to make sure that you can't just hit a bad note and break things.

I did a bunch of experiments with adding various limits inside the simulation. For example, I added a terminal velocity -- a hard cap on how fast a mass could move. This did indeed solve the issue of explosions, but there was a cost, which is that there were a few perfectly stable presets that relied on very high velocities for cool sounds. So the safety of terminal velocity would come at the cost of reduced flexibility -- the instrument would be less capable.

I tried all kinds of things to make the terminal velocity idea work. I used a smooth saturation limit instead of a hard cap, I tried terminal acceleration, terminal per-step dv/dt, etc. Most of these ideas had even bigger problems than straight velocity capping, and certainly they all reduced flexibility by imposing limits on the simulation.

There were other issues with velocity capping. When velocities were just under the cap, you could get really weird situations where a system would gain enough energy to hit the cap, get capped, slow down, build back up, and oscillate between hitting the cap and slowing down. In some cases this led to instruments that permanently rang out; damping became ineffective.
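For reference, the two capping styles mentioned above (a hard cap and a smooth saturation limit) can be sketched as follows. The function names are hypothetical and `tanh` is just one possible saturation curve, not necessarily the one that was tried.

```python
import math


def hard_cap(v: float, v_max: float) -> float:
    """Clamp velocity to [-v_max, v_max]; values inside the range are untouched."""
    return max(-v_max, min(v_max, v))


def soft_cap(v: float, v_max: float) -> float:
    """Smoothly saturate velocity toward +/- v_max using tanh (an assumed curve)."""
    return v_max * math.tanh(v / v_max)


# The soft cap bends velocities well below the limit, not just those above it.
unchanged = hard_cap(1.0, 2.0)  # inside the range, so returned as-is
reduced = soft_cap(1.0, 2.0)    # already noticeably below 1.0
```

The second call hints at why these limits cost flexibility: a smooth limiter alters legitimate, stable motion too, not only the runaway cases.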

Automatic Physics Explosion Mitigation

Ultimately I decided that I didn't want a solution that limited the flexibility of the instrument. So I came up with a way to automatically mitigate explosions without capping velocity, acceleration, etc.

The solution is quite simple: over the course of simulating the physics for an audio block, the GPU simulator keeps track of the highest velocity observed for each mass. Then on the CPU side, each mass's max velocity is compared to a very high threshold, and if any mass in a voice instance exceeds it, that voice instance is hard-reset to initial conditions and its audio for that block is dropped. The user also gets a little toast notification about what happened.
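The CPU-side check described above might look roughly like this. It is a sketch only: the data layout, the `reset_voice` callback, the function name, and the threshold value are all hypothetical stand-ins, and the real simulator runs on the GPU.

```python
from typing import Callable

# Hypothetical threshold; the post only says it is "very high".
EXPLOSION_VELOCITY_THRESHOLD = 1.0e6


def mitigate_explosions(
    max_velocity_per_voice: list[list[float]],
    audio_blocks: list[list[float]],
    reset_voice: Callable[[int], None],
    threshold: float = EXPLOSION_VELOCITY_THRESHOLD,
) -> list[int]:
    """Reset any voice whose per-mass peak velocity exceeded the threshold.

    max_velocity_per_voice[i][j] is the highest speed seen for mass j of
    voice i during the block. Returns the indices of the voices that were
    reset, e.g. so the UI can show a notification.
    """
    exploded = []
    for voice_index, peaks in enumerate(max_velocity_per_voice):
        if any(peak > threshold for peak in peaks):
            reset_voice(voice_index)  # back to initial conditions
            block = audio_blocks[voice_index]
            block[:] = [0.0] * len(block)  # drop this block's audio in place
            exploded.append(voice_index)
    return exploded


# Usage: voice 1 has a runaway mass, voice 0 is fine.
resets: list[int] = []
audio = [[0.1, 0.2], [0.3, 0.4]]
exploded_voices = mitigate_explosions(
    [[1.0, 2.0], [5.0, 1.0e9]], audio, resets.append
)
```

Because the check only observes velocities and never modifies them during simulation, stable instruments behave exactly as they would without it.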

What this means is that if you play too high of a note, from your perspective, nothing happens; it just doesn't work. The explosion gets detected immediately and the voice is reset without making any noise. This is much better than having the note explode, requiring you to do something to fix it. You just find that the range of your instrument is a bit limited.

This works extremely well, and the velocity threshold can be so high that there's no realistic instrument that would hit it. I tried some extreme examples and it does not seem possible to trigger a false positive for the explosion detection.

Now, I said that this mitigation happens without any noise. That's not entirely true. In some cases, you can hear a small click due to the discontinuous reset. This isn't a huge deal, but I want to make the factory presets work perfectly, so I also added a feature to set the MIDI note range that the instrument responds to. Thus for "perfect" presets, you build the instrument, figure out where its usable range is, and then limit it to only those notes. This isn't completely necessary, but it guarantees super stable instruments with absolutely no clicking or anything. Woohoo!
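The note-range limit amounts to a simple filter on incoming MIDI notes. A minimal sketch, with a hypothetical `NoteRange` type and example bounds that are not taken from any actual preset:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoteRange:
    """Inclusive range of MIDI note numbers (0-127) an instrument responds to."""
    lowest: int = 0
    highest: int = 127

    def accepts(self, note: int) -> bool:
        return self.lowest <= note <= self.highest


# Example bounds chosen arbitrarily: respond to C2 (36) through C6 (84) only.
preset_range = NoteRange(lowest=36, highest=84)
plays_middle_c = preset_range.accepts(60)  # inside the range
plays_c7 = preset_range.accepts(96)        # above the range, so ignored
```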

by Evan at 1/15/2025, 1:17:16 AM (tags: gui, ux)
