Bugs, GPUs, DMA, and pinned memory

Captain's Log: Stardate 77513.9

Fixing the delay line bug from yesterday took way too long. I'm still not satisfied that I fully understand it, but it's definitely fixed. There were two independent bugs causing similar behavior, and one of them was easy to understand. But when I fixed that one, the other went away as well. I haven't figured out whether the fix truly resolved the other bug or just hid it, which bothers me. But I might just file this away as "investigate later" since things are working perfectly.

Sadly, since fixing this bug took so long, I still have a couple of small features to finish up before doing the demo video. I'm pretty sure, though, that I can get those done tomorrow morning and make a demo in the afternoon. We'll see... I have noticed that demo videos posted on Friday get a lot fewer views on YouTube than ones posted on, say, Tuesday or Wednesday. So even if I make the video I might hold it back for a few days. 🙂

Captain's Log: Supplemental

I figured out the second bug. To write external audio input to the GPU, I was mapping the sample buffer with CL_MAP_WRITE_INVALIDATE_REGION. The external audio goes into the same buffer that's used for internal audio signals (the sample buffers are unified for all delay signals).

The problem is that, due to the way the sample buffer is strided, the mapped memory included both the parts that external audio is written to and parts that contain internal audio data. As an optimization, CL_MAP_WRITE_INVALIDATE_REGION assumes that the CPU will rewrite all of the mapped data, so it doesn't bother to DMA the GPU memory for the mapped region into CPU memory, since it's just going to be overwritten.
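To make the difference between the two kinds of write mapping concrete, here is a toy model in Python. It is only a sketch: plain lists stand in for GPU memory and the host-side buffer, the names ToyBuffer, map_region, unmap_region, and write_external are hypothetical, and the layout (even slots for external audio, odd slots for internal audio) is an assumed stand-in for the real strided buffer. None of this is the OpenCL API itself.

```python
# Toy model only: lists stand in for GPU memory and the host staging buffer.
# ToyBuffer, map_region, and unmap_region are hypothetical names, not OpenCL calls.

class ToyBuffer:
    def __init__(self, size):
        self.device = [0.0] * size  # stands in for GPU memory
        self.host = [0.0] * size    # stands in for the host memory behind the mapping

    def map_region(self, offset, length, invalidate):
        # A plain write map is modeled as copying the device contents to the host first.
        # An invalidating map skips that copy, so the host keeps whatever it last held.
        if not invalidate:
            self.host[offset:offset + length] = self.device[offset:offset + length]
        return offset, length

    def unmap_region(self, region):
        # Unmapping copies the whole mapped region back, whether or not it was written.
        offset, length = region
        self.device[offset:offset + length] = self.host[offset:offset + length]


def write_external(invalidate):
    buf = ToyBuffer(4)
    # Assumed layout: even slots are external audio, odd slots are internal audio.
    buf.device = [0.0, 0.5, 0.0, 0.7]
    region = buf.map_region(0, 4, invalidate)
    buf.host[0] = 1.0  # the CPU writes only the external slots
    buf.host[2] = 2.0
    buf.unmap_region(region)
    return buf.device


print(write_external(invalidate=False))  # [1.0, 0.5, 2.0, 0.7]
print(write_external(invalidate=True))   # [1.0, 0.0, 2.0, 0.0]
```

In this model the invalidating map loses the internal values 0.5 and 0.7, because the host copy of those slots was never refreshed before being written back.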

Only... I wasn't overwriting it all. Since I'm using pinned memory, the same CPU virtual memory address is used every time I map the GPU memory, so the parts of the mapped region that I didn't write still held old data. When the region was unmapped, that old data went back to the GPU along with the new external audio. So right before processing each audio block, old data was written to all the internal sample buffers. But you'd only see this if you set the delay to be so long that it would reach all the way backwards through the ring buffer to (almost) the beginning.
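The per-block effect can be sketched with another small simulation. Again this is a toy under stated assumptions, not the real engine: run_blocks is a hypothetical function, slot 0 is assumed to hold external input and slots 1 to 3 internal audio, and the "kernel" just writes recognizable numbers. The one thing it tries to capture is that the same host buffer is reused for every map.

```python
def run_blocks(num_blocks, invalidate):
    """Return a snapshot of device memory taken just before each simulated kernel run."""
    size = 4
    device = [0.0] * size  # stands in for the GPU sample buffer
    pinned = [0.0] * size  # one host buffer, reused for every map (as with pinned memory)
    snapshots = []
    for block in range(1, num_blocks + 1):
        # Map: only a non-invalidating map refreshes the host copy from the device.
        if not invalidate:
            pinned[:] = device
        # The CPU writes external input into slot 0 only (assumed layout).
        pinned[0] = float(block)
        # Unmap: the entire mapped region is copied back to the device.
        device[:] = pinned
        snapshots.append(list(device))
        # Simulated kernel: the GPU writes internal audio into slots 1..3.
        for i in range(1, size):
            device[i] = block * 10.0 + i
    return snapshots


for snapshot in run_blocks(3, invalidate=True):
    print(snapshot)
# [1.0, 0.0, 0.0, 0.0]
# [2.0, 0.0, 0.0, 0.0]
# [3.0, 0.0, 0.0, 0.0]
```

With invalidate=True, the internal slots are reset to the host buffer's old contents before every block, even though the simulated kernel wrote fresh values into them the block before. With invalidate=False, the second snapshot is [2.0, 11.0, 12.0, 13.0], so the internal audio survives.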

So both problems are now fixed and everything is working great. And I am no longer going to go crazy worrying about the second bug.


© 2024 Anukari LLC, All Rights Reserved