Wednesday, September 16, 2015

Day 24: On the Fence and Optimization Failure

I must say, 68K is quite the difficult thing to optimize. Every time I find a way to optimize the code, it either ends up taking longer to run or doesn't work at all. Earlier I ran into trouble trying to save data on the stack from inside a subroutine, not realizing that rts pops its return address off the top of the stack, so anything the subroutine pushes has to come back off before it returns and can't be left there for later use.

What next? Well, when running QuickSort on 68K, there's a problem when scanning downward for values less than the pivot. What if there are no lesser values? The scan runs past the start of the data and will eventually find something, but it will not swap, since by then it is outside the range and past the partition. However, how long it takes to find that something is uncertain and may cost clock time. Solution? Just check on every step whether the address has reached the beginning of the data to sort. Problem? All this constant checking actually takes MUCH longer than letting the scan go past that point and resolve itself. We're talking 6 cycles saved when sorting 8 bytes, and 100,000 cycles of overtime when sorting 3 KB. Yeesh.
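To make the cost of that check concrete, here is a minimal C++ sketch, not the 68K routine from this post. It is a Hoare-style partition whose downward scan carries the explicit "have I reached the start of the data?" test, and it counts how many times that test runs. The names `partition_checked`, `quicksort_checked`, and `g_bound_checks` are made up for illustration. Two assumptions to flag: C++ cannot legally read below the start of an array, so the "let it run past" variant is not shown, and because the pivot is taken from `a[lo]`, the test never changes the result here. It is still paid on every single step of the scan, which is where the overtime comes from.

```cpp
#include <cassert>
#include <utility>

// Running total of lower-bound tests performed by the downward scan.
static long g_bound_checks = 0;

// Hoare-style partition of a[lo..hi] (inclusive) around the value a[lo].
// Returns j such that everything in a[lo..j] <= everything in a[j+1..hi].
static long partition_checked(int* a, long lo, long hi) {
    assert(lo < hi);
    const int pivot = a[lo];
    long i = lo - 1;
    long j = hi + 1;
    while (true) {
        // Upward scan: stop at the first value that is not less than the pivot.
        do { ++i; } while (a[i] < pivot);
        // Downward scan: stop at the first value that is not greater than the
        // pivot. The `j > lo` test is the extra bound check; count every use.
        do { --j; ++g_bound_checks; } while (j > lo && a[j] > pivot);
        if (i >= j) return j;   // scans met or crossed: no swap
        std::swap(a[i], a[j]);
    }
}

// Recursive driver: sorts a[lo..hi] (inclusive) in ascending order.
static void quicksort_checked(int* a, long lo, long hi) {
    if (lo >= hi) return;
    const long p = partition_checked(a, lo, hi);
    quicksort_checked(a, lo, p);
    quicksort_checked(a, p + 1, hi);
}
```

Calling `quicksort_checked(data, 0, n - 1)` and then reading `g_bound_checks` shows how the number of tests grows with the amount of data, even though almost none of them ever stop the scan.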

On another note, I also looked further into DirectX device creation in DX12. The interesting additions are fences and command lists: a command list holds a series of rendering commands that gets submitted to a queue to be run and processed, and a fence lets you synchronize on the completion of those lists. Definitely a step up from DirectX 11, and definitely another thing to keep in mind when actually setting up those rendering commands.
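The relationship between the two can be sketched with a toy, single-threaded model in plain C++. This is not the Direct3D 12 API: `ToyQueue`, `ToyFence`, `ToyCommandList`, `wait_for`, and `toy_demo` are hypothetical names invented for this sketch, and the "GPU" is just a loop that drains a deque. It only mirrors the general shape of the real calls, where a command queue executes recorded lists, `ID3D12CommandQueue::Signal` places a fence value in the queue, and `ID3D12Fence::GetCompletedValue` reports how far the GPU has gotten.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-ins only; none of these types exist in Direct3D 12.
using ToyCommand = std::function<void()>;
using ToyCommandList = std::vector<ToyCommand>;

struct ToyFence {
    std::uint64_t completed = 0;  // last value the pretend GPU has reached
};

class ToyQueue {
    struct Item {
        ToyCommandList list;   // commands to run (empty for a signal)
        ToyFence* fence;       // fence to update, or nullptr
        std::uint64_t value;   // value to store in the fence
    };
    std::deque<Item> items_;

public:
    // Submit a recorded list; nothing runs until the queue is stepped.
    void execute(ToyCommandList list) {
        items_.push_back(Item{std::move(list), nullptr, 0});
    }
    // Queue a marker that sets the fence once all earlier work is done.
    void signal(ToyFence& fence, std::uint64_t value) {
        items_.push_back(Item{ToyCommandList{}, &fence, value});
    }
    // Pretend GPU: process one queued item. Returns false if idle.
    bool step() {
        if (items_.empty()) return false;
        Item item = std::move(items_.front());
        items_.pop_front();
        for (const ToyCommand& cmd : item.list) cmd();
        if (item.fence != nullptr) item.fence->completed = item.value;
        return true;
    }
};

// CPU-side wait: keep the pretend GPU running until the fence reaches value.
// Returns false if the queue drains without the fence ever getting there.
bool wait_for(ToyQueue& queue, const ToyFence& fence, std::uint64_t value) {
    while (fence.completed < value) {
        if (!queue.step()) return false;
    }
    return true;
}

// Usage: submit one "frame" of two commands, then wait on fence value 1.
int toy_demo() {
    ToyQueue queue;
    ToyFence fence;
    int draws = 0;
    ToyCommandList frame = { [&draws] { ++draws; }, [&draws] { ++draws; } };
    queue.execute(frame);
    queue.signal(fence, 1);
    assert(draws == 0 && fence.completed == 0);  // submitted, not yet run
    const bool reached = wait_for(queue, fence, 1);
    assert(reached && draws == 2);               // fence implies work is done
    (void)reached;
    return draws;
}
```

The point of the model is the ordering guarantee: because the signal sits in the queue behind the command list, seeing the fence at value 1 means every command submitted before it has already run.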
