
Digging into usability

Captain's Log: Stardate 78422.5

Recently I've started to feel pretty good about Anukari's reliability and compatibility with a broad cross-section of DAWs. Having 12+ DAWs set up to test against has helped, and the chaos monkey has also really started to give me confidence that it's getting harder to crash the plugin. Performance is looking pretty good on many machines, too. So finally I'm feeling like I can spend some time making Anukari more usable. This is something I've wanted to work on for quite a long time, but when a piece of software is crashing a lot, it's hard to argue that the crashes themselves aren't the biggest usability issue, so that had to come first.

The Entity Palette

The first big improvement I made this week was adding the "entity palette," which is a horizontal menu at the bottom of the 3D view that presents little thumbnails of all the different kinds of entities that you can create, allowing them to be dragged into the 3D world to place them. Previously you had to either use the right-click context menu or hotkeys to create entities, and that only allowed the creation of top-level entity types. So for example, to create an Oscillator exciter, you had to create a Mallet exciter and then change its type to Oscillator. Now, you can just scroll over to the little picture of an Oscillator and drag it into the world. Here's what it looks like:

The thumbnails are a huge help in terms of figuring out what you want. They're way easier to work with than just a textual list. They are automatically rendered based on the actual 3D skin that you've selected, so the visuals you see in the palette match what you'll actually get in the 3D world. This made it substantially more complex to build, but I feel that it was worth it so that the visuals are consistent. Also, I think it's just a nice touch.

Another really helpful part of the palette is that it provides a place to add tooltips for all of the entities. Anukari's interface has problems with discoverability; there is a lot of complex stuff you can do, but historically there hasn't been any way to learn about it, other than perhaps watching a tutorial video. Now you can go through the palette, mouse over any of the entities, and get a quick idea of what it does and how it functions. In particular, the tooltips discuss how entities need to be connected to one another, which is a tricky issue that calls for a solution of its own.

Connection Highlighting

In Anukari, entities that interact with one another have to be explicitly connected via springs or various kinds of "links." It's not immediately obvious how this works; for example, springs can only be connected to Bodies and Anchors. Exciters can be connected to free Bodies (but not Anchors) to induce vibration in them, and they can also be connected to Modulators to have their parameters modulated. Even more complex systems exist, such as Delay Lines, which connect to entities that operate on waveforms.

Previously, the only way to learn what kinds of connections were possible was to select an entity, choose the "connect" command (or hotkey), and drag the resulting link around: it would only snap to entities that were valid connections.

The new system that I've implemented is much better. Whenever you are placing a link, all of the entities that you can connect it to are automatically highlighted, and the highlight is color-coded based on what kind of link you'd get if you made the connection. The really cool thing is that with the entity palette, you can drag in a specific kind of link and immediately see what you can do with it. This makes the link system dramatically more discoverable.

Here's an example of the highlighting in action. In this video I drag a spring from the palette into the world. Immediately it highlights all the entities that it could be attached to, which is all Bodies and Anchors. I drop it on an Anchor, and the set of highlighted entities changes: Anchors can't be connected to other Anchors, so only the Bodies are highlighted.
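
To make the rules above concrete, here's a minimal sketch of how a table-driven validity check could drive this kind of highlighting. This is Python with entirely hypothetical names and colors, not Anukari's actual code; it only encodes the rules described in this section:

```python
# Hypothetical sketch of a table-driven connection-validity check.
# Only the rules mentioned above are encoded; kinds and colors are made up.

VALID_TARGETS = {
    # (link kind, target entity kind) -> highlight color
    ("spring", "body"): "green",
    ("spring", "anchor"): "green",
    ("exciter", "body"): "orange",
    ("exciter", "modulator"): "purple",
}

def highlight_color(link_kind, source_kind, target_kind):
    """Return a highlight color if the connection is valid, else None."""
    # From the example above: a spring already attached to an Anchor
    # can't be attached to another Anchor.
    if link_kind == "spring" and source_kind == "anchor" and target_kind == "anchor":
        return None
    return VALID_TARGETS.get((link_kind, target_kind))

def entities_to_highlight(link_kind, source_kind, entities):
    """While a link is being placed, highlight every valid target."""
    highlights = []
    for entity in entities:
        color = highlight_color(link_kind, source_kind, entity.kind)
        if color is not None:
            highlights.append((entity, color))
    return highlights
```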

Discoverable Camera Controls

For a long time it's irked me that the mouse/keyboard controls used to move the 3D camera around are so obscure. To some extent it's an unavoidable problem: between zoom, x/y rotation, and x/y/z pan, the user has control over 6 degrees of freedom, while a touchpad only has 2. Gestures help a bit; two-finger scroll adds a degree of freedom, and on a mouse the wheel serves the same purpose. But ultimately it's difficult to avoid the use of modifier keys to access all the camera dimensions.

During the pre-alpha, I've had to point users to a document that lists all the camera controls. I'm pretty sure that this is super not-fun. So it's been a goal for a long time to make the controls discoverable in the plugin itself. But I wanted something better than just a help menu that parrots what's in the doc.

What I finally ended up with is a system loosely based on a feature in Blender (3D modeling software that has to solve the same problem): there are now icons hovering over the 3D viewport that you can drag with the mouse (or touchpad) to adjust the camera. There's a zoom icon, a rotate icon, and a pan icon. Each of them highlights on mouseover and changes the cursor to an open hand, which becomes a grabbing hand when you click. So it's fairly obvious that you can interact with them by dragging, and once you try it, it's quite obvious what each one does. The 2D icon graphics need improvement, but they're pretty communicative at this point.
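
As a rough illustration of how each icon's 2D drag maps onto a slice of those degrees of freedom, here's a minimal orbit-camera sketch in Python. All of the names and sensitivity values are made up; this is not Anukari's actual implementation:

```python
import math

class OrbitCamera:
    """Toy orbit camera: a target point, a distance, and two angles."""

    def __init__(self):
        self.target = [0.0, 0.0, 0.0]   # x/y/z pan
        self.distance = 10.0            # zoom
        self.azimuth = 0.0              # y rotation (radians)
        self.elevation = 0.3            # x rotation (radians)

    def drag_rotate(self, dx, dy, sensitivity=0.01):
        # The rotate icon: drag adjusts the two orbit angles.
        self.azimuth += dx * sensitivity
        self.elevation = max(-1.5, min(1.5, self.elevation + dy * sensitivity))

    def drag_zoom(self, dy, sensitivity=0.05):
        # The zoom icon: exponential zoom feels uniform at any distance.
        self.distance *= math.exp(dy * sensitivity)

    def drag_pan(self, dx, dy, sensitivity=0.01):
        # The pan icon: translate the target in the camera plane,
        # scaled by distance so the scene roughly tracks the cursor.
        scale = self.distance * sensitivity
        right = [math.cos(self.azimuth), 0.0, -math.sin(self.azimuth)]
        up = [0.0, 1.0, 0.0]  # simplified: elevation ignored for brevity
        for i in range(3):
            self.target[i] += (-dx * right[i] + dy * up[i]) * scale
```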

I think for a lot of users, dragging these icons will be a fine way to control the camera without any further learning. But for power users the icons serve an additional purpose: like the entity palette, they provide a place to put tooltips, and in the tooltips, the hotkeys are explained for advanced users:

There's still a lot of work to do to make Anukari easy to use. One thing that I've been experimenting with is how to show that snap-to-grid is enabled in the 3D view, since this has been a source of confusion in the past (a user created new entities and was confused by their placement -- it was because they forgot that snap-to-grid was on). I've tried a few kinds of visuals but so far haven't found anything that is helpful and also looks good. Another thing I want to do is to automatically highlight buttons in the GUI when certain events occur, to hint to the user what to do next. For example, the circuit breaker feature can optionally pause the simulation if it detects a physics explosion. It pops up a message explaining this, but I'd also like the button that resets the simulation to pulse/highlight in some way, so that the user will immediately see what to do.

But even with these few improvements, I'm starting to be quite optimistic that I'll be able to make the experience for new users pretty fun.

The chaos monkey lives

In the last couple of days I finally got around to building the "chaos monkey" that I've wanted to have for a long time. The chaos monkey is a script that randomly interacts with the Anukari GUI with mouse and keyboard events, sending them rapidly and with intent to cause crashes.

I first heard about the idea of a chaos monkey from Netflix, which has a system that randomly kills datacenter jobs. This is a really good idea, because you never actually know that you have N+1 redundancy until one of the N jobs/servers/datacenters actually goes down. Too many times I have seen systems that supposedly had N+1 redundancy die when just one cluster failed, because nobody had tested this, and surprise, the configuration somehow depended on all the clusters being up. Netflix has the chaos monkey, and at Google we had DiRT testing, where we simulated things like datacenter failures on a regular basis.

But the "monkey" concept goes back to 1983 with Apple testing MacPaint. Wikipedia claims that the Apple Macintosh didn't have enough resources to do much testing, so Steve Capps wrote the Monkey program which automatically generated random mouse and keyboard inputs. I read a little bit about the original Monkey and it's funny how little has changed since then. They had the problem that it only ran for around 20 minutes at first, because it would always end up finding the application quit menu. I had the same problem, and Anukari now has a "monkey mode" which disables a few things like the quit menu, but also dangerous things like saving files, etc.

The Anukari chaos monkey is decently sophisticated at this point. It generates all kinds of random mouse and keyboard inputs, including weird horrible stuff like randomly adding modifiers and pressing keys during a mouse drag. It knows how to move and resize the window (since resizing has been a source of crashes in the past). It knows about all the hotkeys that Anukari supports, and presses them all rapidly. I really hate watching it work because it's just torturing the software.
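
To give a flavor of what the monkey does, here's a stripped-down sketch of that kind of random-input loop using pyautogui. The hotkey list, probabilities, and structure are illustrative only, not Anukari's actual script:

```python
import random
import pyautogui

# Illustrative subsets -- the real monkey knows every hotkey Anukari supports.
HOTKEYS = ["c", "delete", "r", "ctrl+z", "ctrl+shift+z"]
MODIFIERS = ["shift", "ctrl", "alt"]

def random_point_in(region):
    left, top, width, height = region
    return (left + random.randint(0, width - 1),
            top + random.randint(0, height - 1))

def monkey_step(window_region):
    action = random.random()
    if action < 0.4:
        # Random left or right click somewhere in the window.
        pyautogui.click(*random_point_in(window_region),
                        button=random.choice(["left", "right"]))
    elif action < 0.7:
        # Random drag, sometimes while holding a modifier key.
        start, end = random_point_in(window_region), random_point_in(window_region)
        mod = random.choice(MODIFIERS + [None])
        if mod:
            pyautogui.keyDown(mod)
        pyautogui.moveTo(*start)
        pyautogui.dragTo(*end, duration=0.2)
        if mod:
            pyautogui.keyUp(mod)
    else:
        # Mash a random hotkey.
        pyautogui.hotkey(*random.choice(HOTKEYS).split("+"))
```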

The chaos monkey has already found a couple of crashes and several less painful bugs, which I have fixed. One of the crashes was something I completely didn't expect, and didn't think was possible: it had to do with keyboard hotkey events deleting entities while a slider was being dragged to edit those entities' parameters. I never would have tested this manually because I didn't think it was possible.

The chaos monkey itself is pretty simple. The biggest challenge was keeping it from wreaking havoc on my workstation. I'm using pyautogui, which generates OS-level input events, meaning that the events get sent to whatever window is active. So at the start, if Anukari crashed, the chaos monkey would start torturing whatever was behind it, e.g. VSCode or Chrome. It was horrible, and a couple of times it got loose and went crazy. It also figured out how to send OS-level hotkeys to open the task manager, etc.

The main safety protection I ended up implementing is that prior to each mouse or keyboard event, the script uses the win32 APIs to query the window under the mouse and verify that it belongs to Anukari. There's some fiddly stuff here, like figuring out whether a window has the same process ID as Anukari (some pop-up menus don't have Anukari as a parent window), and some special handling for file browser dialogs, which don't even share the process ID. But overall I've gotten it to the point where I have let it run for hours on my desktop without worry.
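
For reference, the core of that check is only a few lines with the pywin32 bindings. This is a simplified sketch that only handles the same-process case; the special cases mentioned above need extra logic:

```python
import win32gui
import win32process

def is_anukari_window(x, y, anukari_pid):
    """Return True if the window under (x, y) belongs to the Anukari process.

    Simplified sketch: it only compares process IDs, whereas the real script
    also needs special handling (e.g. file browser dialogs that run in a
    different process).
    """
    hwnd = win32gui.WindowFromPoint((x, y))
    if not hwnd:
        return False
    _, pid = win32process.GetWindowThreadProcessId(hwnd)
    return pid == anukari_pid

def safe_send(event_fn, x, y, anukari_pid, *args, **kwargs):
    """Only deliver the input event if the target window is Anukari's."""
    if is_anukari_window(x, y, anukari_pid):
        event_fn(*args, **kwargs)
```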

The longest Anukari has now run with the chaos monkey is about 10 hours with no crashes. Other things looked good too; for example, it doesn't leak memory. I have a few more ideas on how to make the chaos monkey even more likely to catch bugs, but for now I'm pretty satisfied.

Here's a quick video of the chaos monkey interacting with Anukari. Note that during the periods where the mouse isn't doing anything, it's mashing hotkeys like crazy. I'm starting to feel much more confident about Anukari's stability.

Complications from custom 3D models

While working on the new 3D model assets for Anukari, one "little" TODO that I filed away was to make the mouse interactions with the 3D objects work correctly in the presence of custom models. This includes things like mouse-picking or box-dragging to select items.

In the first revision, because the 3D models were fixed, I simply hard-coded each entity type's hitbox as a cylinder or sphere, which worked great. With custom 3D models, however, this is no longer tenable: the old hitboxes did not necessarily correspond to the shape or size of the custom models. This became obvious and annoying quite quickly as we began to change the shapes of the models.

This was one of those problems that seems simple but spirals into something much more complicated.

My first idea was to use Google Filament's mouse picking feature, which renders entity IDs to a hidden buffer and then samples that buffer to determine which entity was clicked. This has the advantage of being pixel-perfect, but it uses GPU resources to render the ID buffer, and it also requires the GUI thread to wait a couple frames for the Renderer thread to do the rendering and picking. Furthermore, it doesn't help at all with box-dragging, as this method is not capable of selecting entities that are visually obscured by closer entities. But in the end, the real killer for this approach was the requirement for the GUI thread to wait on the Renderer thread to complete a frame or two. This is doable, but architecturally problematic, due to the plans that I have to simplify the mutex situation for the underlying data model -- but that's a story for another day.

My second idea was to write some simple code to read the model geometry and approximate it with a sphere, box, or cylinder, and use my existing intersection code based on that shape. But I really, really don't want to find myself rewriting the mouse-picking code again in 3 months, and I decided that this approach just isn't good enough -- some 3D models would have clickable areas that were substantially different from their visual profile.

So finally I decided to just bite the bullet and use the Bullet Physics library for collision handling. It supports raycasting, which I use for mouse-picking, and I use its generalized convex collision detection for frustum picking. The documentation for Bullet sucks really hard, but with some help from ChatGPT it wasn't too bad to get up and running. The code now approximates each 3D model's geometry with a simplified 42-vertex convex hull, which is extremely fast for the collision methods I need and approximates even weird shapes quite well (I tried using the full un-approximated 3D geometry for pixel-perfect picking, but it was too slow). I'm very happy with the results, and it seems that pretty much any 3D model someone can come up with will work well with mouse interactions.
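
Anukari uses Bullet's C++ API directly, but the basic idea is easy to sketch with the pybullet Python bindings: build a convex hull collision shape from a model's vertices, then raycast into that world for mouse-picking. This is just an illustration of the technique, not the plugin's actual code (and it leaves out the frustum-picking side):

```python
import pybullet as p

physics = p.connect(p.DIRECT)  # headless; no GUI needed just for picking

def add_pickable_entity(vertices, position):
    """Register an entity's geometry as a convex collision shape.

    pybullet's GEOM_MESH collision shape is built as the convex hull of the
    supplied vertices, which is the same flavor of approximation described
    above (the real code also simplifies the hull first).
    """
    shape = p.createCollisionShape(p.GEOM_MESH, vertices=vertices)
    return p.createMultiBody(baseMass=0,
                             baseCollisionShapeIndex=shape,
                             basePosition=position)

def pick(ray_origin, ray_end):
    """Mouse picking: cast a ray from the camera through the cursor."""
    hit = p.rayTest(ray_origin, ray_end)[0]
    body_id = hit[0]  # -1 means the ray hit nothing
    return body_id if body_id >= 0 else None
```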

The things that made this a week-long job rather than a 2-day job were the ancillary complications. The main issue is that while the old hard-coded hitboxes were fixed at compile-time, the new convex hull hitboxes are only known by the Renderer thread, and can dynamically change when the user changes the 3D skin preset. This introduced weird dependencies between parts of the codebase that formerly did not depend on the Renderer. I ended up solving this problem by creating an abstract EntityPicker interface which the Renderer implements, so at least the new dependencies are only on that interface rather than the Renderer itself.
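
In spirit, the interface boils down to a handful of geometry queries. Here's a Python sketch with invented method names (the real interface is C++ and surely differs in detail):

```python
from abc import ABC, abstractmethod

class EntityPicker(ABC):
    """Geometry queries the data-model code needs, without depending on the Renderer.

    The Renderer implements this, since only it knows the convex hulls for the
    currently selected 3D skin. Method names here are hypothetical.
    """

    @abstractmethod
    def pick_ray(self, origin, direction):
        """Return the entity ID hit by a ray (mouse picking), or None."""

    @abstractmethod
    def pick_frustum(self, frustum_planes):
        """Return all entity IDs intersecting a frustum (box-drag select)."""

    @abstractmethod
    def entity_aabb(self, entity_id):
        """Return the entity's axis-aligned bounding box as (min, max)."""
```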

An example here is when the user copies and pastes a group of entities. The data model code that does this has a bunch of interesting logic to figure out where the new entities should go, in order to avoid overlapping them with any existing entities. It's a tricky problem because we want them to go as close as possible to where the user is looking, but have to progressively fall back to worse locations if the best locations are not available. Anyway, this requires being able to query the AABBs of existing entities, which is now geometry-dependent.
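
A drastically simplified version of that placement search might look like the following; the helper names and the exact fallback policy are hypothetical:

```python
def find_paste_position(group_aabb, picker, existing_ids, candidates):
    """Place a pasted group at the best candidate position that doesn't overlap.

    `candidates` is assumed to be ordered from most to least desirable
    (e.g. near where the user is looking, then progressively farther away).
    """
    existing = [picker.entity_aabb(eid) for eid in existing_ids]
    for position in candidates:
        moved = translate_aabb(group_aabb, position)
        if not any(aabbs_overlap(moved, other) for other in existing):
            return position
    # Everything overlaps something: fall back to the least-bad option.
    return candidates[-1]

def aabbs_overlap(a, b):
    (a_min, a_max), (b_min, b_max) = a, b
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(3))

def translate_aabb(aabb, position):
    lo, hi = aabb
    return ([lo[i] + position[i] for i in range(3)],
            [hi[i] + position[i] for i in range(3)])
```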

Another example is creating new entities. This is similar to copying and pasting, but the requirement that the entity go near where the user clicked the mouse is less flexible. A final example is rotating entities, where the rotational sphere radius needs to be known, as well as the desired diameter for the "rotation cage" that appears around the entity to indicate that it's being rotated.

Anyway, it took a few days but finally I think I have all of these use-cases working correctly. Fortunately I had unit tests for all this stuff, so that helped a lot. This is a pretty nice milestone, since I think this is the last "heavy lift" for the new 3D model configurability feature.

As usual, there are still a few fiddly details that I need to address. The biggest one is that it's a little slow when you delete 1,000 entities. This is an edge case, but it is noticeable and irritates me. I think I know what I'll do to speed it up, but we'll see.

