
Captain's Log: Stardate 78203.4

Today I got the new rendering approach working on MacOS, using an NSView that hovers over the main editor window. It works just like on Windows, with the right-click pop-up menu correctly displaying on top (via another NSView). I have fully weeded out OpenGL from the app, and now use Vulkan on Windows and Metal on MacOS. That means I'm no longer using any APIs that Apple has deprecated, which was a big blocker for releasing the production version of the app.

The renderer currently does all the camera operations correctly, so you can zoom, rotate, use orthographic views, etc., just like with the old renderer, and it all works. However, none of the entities are displayed yet -- it just loads the "broken helmet" glTF demo model and displays it. The fact that I can now load and render arbitrary glTF models is wonderful, because it means that I can hire an artist for the 3D assets and get them exactly how I want them. With my custom renderer this would have been a lot trickier, since the artist would have had to understand my formats.

Next I need to convert all my existing .obj models to glTF, load them in Filament, instance them, and translate/rotate them into their correct positions for display. The other thing I need to do is rework the parts of the GUI that hover over the 3D window, since that's no longer possible (except via native windows, which have to be rectangular). Both of these things are fairly straightforward, but may take a bit of time to get right.
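As a rough sketch of the translate/rotate step (not the actual Anukari code), the block below builds a column-major 4x4 transform for one instance from a position and a rotation about the Y axis. The `Placement` struct, `makeTransform`, and `transformPoint` are hypothetical names made up for illustration, and the code deliberately does not call Filament; the assumption is that a matrix like this would eventually be handed to the renderer's transform API for each instance.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Column-major 4x4 matrix: element (row r, column c) lives at index c * 4 + r.
using Mat4 = std::array<float, 16>;

struct Vec3 { float x, y, z; };

// Hypothetical per-entity placement; the real entity data will differ.
struct Placement {
    Vec3 position;
    float yawRadians;  // rotation about the +Y axis
};

// Builds T * R: the model is rotated about Y first, then translated.
Mat4 makeTransform(const Placement& p) {
    assert(std::isfinite(p.yawRadians));
    const float c = std::cos(p.yawRadians);
    const float s = std::sin(p.yawRadians);
    return Mat4{
        c,    0.0f, -s,   0.0f,                          // column 0
        0.0f, 1.0f, 0.0f, 0.0f,                          // column 1
        s,    0.0f, c,    0.0f,                          // column 2
        p.position.x, p.position.y, p.position.z, 1.0f   // column 3
    };
}

// Applies the transform to a point (implicit w = 1).
Vec3 transformPoint(const Mat4& m, const Vec3& v) {
    return Vec3{
        m[0] * v.x + m[4] * v.y + m[8]  * v.z + m[12],
        m[1] * v.x + m[5] * v.y + m[9]  * v.z + m[13],
        m[2] * v.x + m[6] * v.y + m[10] * v.z + m[14],
    };
}
```

With a quarter-turn yaw and a position of (10, 0, 5), the model-space point (1, 0, 0) lands at roughly (10, 0, 4): the rotation swings +X onto -Z, and the translation is applied afterwards.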

Now the bad news: running the renderer in Metal did not fix the MacOS audio performance issues. This means that there's something really funny happening, because when I run the app in headless mode for golden tests, it performs much better. And it still performs poorly in GUI mode even if I disable the 3D renderer entirely, so it's not the 3D graphics interfering with the audio. I'm thinking the OS may have some weird heuristics about what kinds of processes to prioritize for GPU compute. So this is still an open area of investigation.

Captain's Log: Stardate 78195.4

Today I finished tidying up a few loose ends from the work I did to allow multiple simulation backends (OpenCL, Metal, eventually CUDA). The main thing here was to parameterize some of the unit tests, such as the fuzz test, so that they would run against all available backends on each OS. I haven't parameterized the golden tests yet, but that's something I'll definitely do at some point.
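To illustrate the idea of running one test against every available backend, here is a minimal, framework-free sketch. Everything in it is an assumption for illustration: the `Backend` enum, `availableBackends`, the `StepFn` stand-in for a simulation step, and the "output stays within [-1, 1]" invariant are all hypothetical, and the real tests may well use a test framework's parameterization support (for example GoogleTest's value-parameterized tests) instead of a hand-written loop.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <random>
#include <string>
#include <vector>

// Hypothetical backend list; the names mirror the backends mentioned above.
enum class Backend { OpenCL, Metal };

std::string backendName(Backend b) {
    return b == Backend::OpenCL ? "OpenCL" : "Metal";
}

// Assumption: which backends exist is decided per OS at compile time.
std::vector<Backend> availableBackends() {
    std::vector<Backend> backends{Backend::OpenCL};
#if defined(__APPLE__)
    backends.push_back(Backend::Metal);
#endif
    return backends;
}

// Stand-in for one simulation step; a real backend would dispatch GPU work.
using StepFn = std::function<float(Backend, float)>;

// Feeds the same random inputs to every available backend and checks an
// invariant on each output. Returns the number of backends exercised.
int runFuzzOnAllBackends(const StepFn& step, std::uint32_t seed, int iterations) {
    int exercised = 0;
    for (Backend b : availableBackends()) {
        std::mt19937 rng(seed);  // re-seeded per backend so inputs are identical
        std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
        for (int i = 0; i < iterations; ++i) {
            const float out = step(b, dist(rng));
            assert(out >= -1.0f && out <= 1.0f);  // hypothetical invariant
        }
        ++exercised;
    }
    return exercised;
}
```

Re-seeding the generator for each backend keeps the inputs identical across backends, so a failure on only one of them points at that backend rather than at the random data.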

After that, I continued work on optimizing the Metal backend. I have some changes that look fairly promising when I run isolated benchmarks, but then when running the full app the performance gains don't appear. This is interesting.

Right now my best guess for what's going on is that the MacOS OpenGL implementation is doing weird/bad stuff behind the scenes. On Windows I've established that the 3D graphics don't interfere in any measurable way with the audio thread's use of the GPU. On MacOS, though, there does seem to be interference, and it's not related to how much computation is happening -- the interference appears to be there even if Anukari doesn't actually draw any pixels. This is what makes me think that Apple's OpenGL implementation is bad.

So I'd like to rule out weird OpenGL issues as the cause for MacOS slowness. Since I eventually need to port the graphics to Metal, I am going to begin work on that now. There's no guarantee it helps with audio performance, but it might, and anyway I have to do it. Thus today I began integrating with the Google Filament library that I'm planning to use for cross-platform graphics.

Captain's Log: Stardate 78146.3

Surprisingly, today I got Anukari running on Metal. It turned out that modifying the OpenCL code so that it could be run via OpenCL or Metal was a lot simpler than I expected. The macros are not all that complicated, and the code is certainly uglier in some places, but for the most part it's not too bad. It took me a while to figure out how Metal does indexing for kernel arguments (device memory and threadgroup memory have different index spaces, for example), but that was the worst of it.
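For readers unfamiliar with the macro approach, here is a minimal sketch of what such a compatibility layer can look like. It is not the actual Anukari code: the macro names (`KERNEL`, `GLOBAL`, `LOCAL`, `BUFFER_ARG`, `LOCAL_ARG`, `THREAD_ARGS`), the `scale_kernel` example, and the host-side fallback branch with `run_on_host` are all assumptions made for illustration. The Metal branch shows the indexing detail mentioned above: `[[buffer(n)]]` and `[[threadgroup(n)]]` are numbered independently, so both can start at 0.

```cpp
// One source, three compilers: Metal Shading Language, OpenCL C, or host C++.
#if defined(__METAL_VERSION__)               // defined by the Metal compiler
  #define KERNEL        kernel
  #define GLOBAL        device
  #define LOCAL         threadgroup
  #define BUFFER_ARG(n) [[buffer(n)]]        // index space for device buffers
  #define LOCAL_ARG(n)  [[threadgroup(n)]]   // separate index space
  #define THREAD_ARGS   , uint gid_ [[thread_position_in_grid]] \
                        , uint lid_ [[thread_position_in_threadgroup]]
  #define GLOBAL_ID     gid_
  #define LOCAL_ID      lid_
#elif defined(__OPENCL_VERSION__)            // defined by OpenCL C compilers
  #define KERNEL        __kernel
  #define GLOBAL        __global
  #define LOCAL         __local
  #define BUFFER_ARG(n)                      // OpenCL binds arguments by position
  #define LOCAL_ARG(n)
  #define THREAD_ARGS                        // thread ids come from built-ins
  #define GLOBAL_ID     ((unsigned int)get_global_id(0))
  #define LOCAL_ID      ((unsigned int)get_local_id(0))
#else                                        // host fallback for unit testing
  #include <cassert>                         // for host-side assertions
  #define KERNEL
  #define GLOBAL
  #define LOCAL
  #define BUFFER_ARG(n)
  #define LOCAL_ARG(n)
  #define THREAD_ARGS   , unsigned int gid_, unsigned int lid_
  #define GLOBAL_ID     gid_
  #define LOCAL_ID      lid_
#endif

// Hypothetical kernel: scales one sample by a gain via local scratch memory.
KERNEL void scale_kernel(GLOBAL float* samples    BUFFER_ARG(0),
                         GLOBAL const float* gain BUFFER_ARG(1),
                         LOCAL float* scratch     LOCAL_ARG(0)  // restarts at 0
                         THREAD_ARGS) {
    scratch[LOCAL_ID] = samples[GLOBAL_ID] * gain[0];
    samples[GLOBAL_ID] = scratch[LOCAL_ID];
}

#if !defined(__METAL_VERSION__) && !defined(__OPENCL_VERSION__)
// Host-only helper: runs the kernel body for a single "thread" with id 0.
float run_on_host(float input, float gain_value) {
    float samples[1] = {input};
    float scratch[1] = {0.0f};
    scale_kernel(samples, &gain_value, scratch, 0u, 0u);
    return samples[0];
}
#endif
```

The host branch is only there so the shared kernel body can be exercised in an ordinary C++ unit test; on the host, `run_on_host(2.0f, 0.5f)` runs the same body for one thread and returns the scaled sample.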

It works well enough to pass basically all of the golden tests. Which is very surprising. Actually it fails a few, but they're the same few that the OpenCL implementation on MacOS fails -- for whatever reason they are extra sensitive to whatever differences there are between the M1 chip and my NVIDIA chip on Windows. So from an audio correctness standpoint, things seem to be working.

I don't yet have a good read on the performance. I slapped the rough draft implementation together very quickly, and haven't yet taken the time to read through the memory allocation/ownership rules that Cocoa / Cocoa Touch use, which means that my implementation leaks memory like a sieve, and that is causing lots of issues. I suspect that there are a bunch of other small things I've done wrong that affect performance as well.

But from what I've seen so far, I don't think a straight port to Metal will automatically answer my performance prayers. I'll have to get it working properly, and then start experimenting with how I might be able to take better advantage of what Metal has to offer. And hopefully the instrumentation/profiling tools will work a lot better to help me with that.


© 2024 Anukari LLC, All Rights Reserved