Bug in Google Filament's Vulkan code
Captain's Log: Stardate 78236.8
Over the last couple of days I've been trying to track down some problems with the latest pre-alpha build.
Resizing Crashes
A user reported that resizing the window caused crashes on the Vulkan renderer, which was weird because I had tested that and never saw any issues. This user was on a laptop with an Intel Iris Xe GPU, though, which immediately made me suspicious since the Intel drivers have a tendency to suck. Fortunately I have a laptop with this chip, and was able to repro the issue right away.
After attaching a debugger and resizing the window, my machine didn't crash but did hang. I found that the hang was somewhere deep in Google Filament's Vulkan code, but it was pretty unclear what it was doing. Filament currently has a bug that prevents it from running in debug builds under MSVC, so I didn't have full symbol information.
I got suspicious, though, because the problem didn't seem like anything I was doing. So I ran Anukari on my main machine with an NVIDIA chip, and resized the window like crazy. Eventually it crashed. Given that it crashed quickly on the Intel chip, and took a long time on the NVIDIA chip, I guessed that maybe this was a resource exhaustion issue. Fortunately on the NVIDIA machine it didn't hang, and I got a Filament panic message about not being able to create a new swap chain.
I read through a lot of the Filament code and eventually concluded that when the window is resized and a new swap chain is created, the old one is never destroyed via vkDestroySwapchainKHR(). Each resize therefore leaks a swap chain, so we're guaranteed to crash eventually. I filed a bug with Filament about it.
As for why the Filament authors failed to call the destroy function, I ran across a Vulkan bug that might explain it. It turns out that it's very unclear when it is safe to destroy a swap chain, to the extent that an extension had to be added to the spec to make it safe to do.
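To make the resource exhaustion theory concrete, here is a toy model of the leak. It is only a sketch: ToyDevice, simulateResizes, and the fixed limit on live swap chains are all hypothetical stand-ins, not Filament or Vulkan API, and real drivers do not expose a simple counter like this.

```cpp
#include <cstddef>
#include <optional>
#include <set>

// Hypothetical stand-in for a driver that can only keep a limited number of
// swap chains alive at once. None of these names are Filament or Vulkan API.
class ToyDevice {
public:
    explicit ToyDevice(std::size_t maxLive) : maxLive_(maxLive) {}

    // Returns a handle, or nothing if the "driver" is out of resources.
    std::optional<int> createSwapChain() {
        if (live_.size() >= maxLive_) return std::nullopt;
        int handle = nextHandle_++;
        live_.insert(handle);
        return handle;
    }

    // Plays the role of vkDestroySwapchainKHR() in this model.
    void destroySwapChain(int handle) { live_.erase(handle); }

    std::size_t liveCount() const { return live_.size(); }

private:
    std::size_t maxLive_;
    int nextHandle_ = 1;
    std::set<int> live_;
};

// Simulates `resizes` window resizes and returns how many succeeded.
// With destroyOld == false the previous swap chain is never released.
int simulateResizes(ToyDevice& device, int resizes, bool destroyOld) {
    std::optional<int> current = device.createSwapChain();
    if (!current) return 0;
    int succeeded = 0;
    for (int i = 0; i < resizes; ++i) {
        std::optional<int> next = device.createSwapChain();
        if (!next) break;  // comparable to the "can't create swap chain" panic
        if (destroyOld) device.destroySwapChain(*current);
        current = next;
        ++succeeded;
    }
    return succeeded;
}
```

In this model, a device limited to 8 live swap chains survives only 7 leaky resizes, while the version that destroys the old swap chain keeps going indefinitely. A smaller limit fails sooner, which would be consistent with the Intel machine failing almost immediately and the NVIDIA machine taking much longer. The model destroys the old swap chain immediately, which sidesteps exactly the "when is it safe" question that makes this hard in real Vulkan code.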
macOS Mouse Hover
I thought to myself, well, at least things work well on macOS. Then I was messing with the GUI on the Mac today and discovered that tooltips weren't appearing for the buttons that hover over the 3D view. Furthermore, the hover states for those elements weren't working properly: when you mouse over them, you see the hover state for a split second, but it doesn't stay. I found that mouse exit events were firing before the mouse had left the window.
Initially I assumed that this had to do with my weird CAMetalLayer-backed NSView for the 3D rendering, but the issue still happens if I don't create the renderer window. I started adding instrumentation to JUCE's NSView event handling, and macOS seems to be calling into it in a perfectly sane way: mouseEnter and mouseExit are called when I think they should be.
I'm still not totally clear on what's going wrong, but I have narrowed it down to JUCE getting confused about which window the mouse is over: the child window, or the parent window that's obscured by the child window. It alternates back and forth between the two juce::Components on each mouseMove event, and whenever it sees the current Component change, it generates a mouseExit for the previous one.
Obviously there's a problem with how JUCE is handling the fact that the child window obscures the parent window, but I'm not yet sure what it is.
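The flip-flopping described above can be pictured with a small simulation. This is a sketch under assumptions, not JUCE's actual implementation: ToyComponent, ToyHoverTracker, and simulateHover are hypothetical names, and the strict alternation on every move is just the simplest way to model the hit test disagreeing with itself.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a juce::Component; only the name matters here.
struct ToyComponent {
    std::string name;
};

// Tracks which component is "under the mouse" and records enter/exit events
// whenever that changes between consecutive mouse moves.
class ToyHoverTracker {
public:
    void mouseMoved(const ToyComponent* under) {
        if (under == current_) return;  // same target, no enter/exit needed
        if (current_) events_.push_back("exit " + current_->name);
        if (under) events_.push_back("enter " + under->name);
        current_ = under;
    }

    const std::vector<std::string>& events() const { return events_; }

private:
    const ToyComponent* current_ = nullptr;
    std::vector<std::string> events_;
};

// Feeds the tracker `moves` mouse moves. When `flipFlop` is true, the hit test
// alternates between the child and the obscured parent on each move.
std::vector<std::string> simulateHover(int moves, bool flipFlop) {
    ToyComponent child{"child"};
    ToyComponent parent{"parent"};
    ToyHoverTracker tracker;
    for (int i = 0; i < moves; ++i) {
        const bool hitParent = flipFlop && (i % 2 == 1);
        tracker.mouseMoved(hitParent ? &parent : &child);
    }
    return tracker.events();
}
```

With a stable hit test, four mouse moves over the child produce a single enter event. With the alternating hit test, the same four moves produce an exit and an enter on every move after the first, which would look like a hover state that appears for a split second and then disappears.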