I think it all stems from the fact that people believe everything about GPUs is documented, and that GPU manufacturers have amazing documentation that tells you exactly what to do. There is obviously more detailed documentation, written by the hardware engineers, that describes all the tricks the software engineers can use. There is, however, one thing that is not always documented: hardware bugs, issues, errata, pick the word you like.
Most hardware errata are properly documented, at least I hope so, but sometimes they are not. Either the closed source driver team never ran into the issue, because they do things sufficiently differently from us that they never end up in the same spot we do, or the engineer who figured out the issue simply forgot to file an erratum, for any of the reasons a human can forget to do something. You might object that we could just send mail to the engineering team and get an answer. Well, may I remind you that the r6xx family was released in June 2007, and by June 2007 the closed source driver was pretty much done and probably 95% of the hardware errata were already fixed in the driver. That means that asking the engineers a question about the r6xx generation is asking about something that is more than 5 years old in their minds. I would not blame any of the engineers for not remembering much about it.
So what happened with hyperz is a simple story. I started looking at it probably in January. The first issue I stumbled upon was some kind of checkerboard corruption. There was an erratum with exactly those symptoms, and AMD told me about it, but the solution did not help. So I started looking at fglrx: I captured one frame of fglrx with hyperz and tried to replay it on the open source driver, but still got the checkerboard issue. Obviously fglrx was setting up the GPU in a different way than we did. This is what triggered the investigation into the backend setup, which produced a patch a couple of months ago that fixed that and also gave a performance improvement.
Once the setup issue was cleared up, I got back to hyperz and stumbled upon more issues than I care to remember; the patch history will probably highlight the biggest ones. Again, all along the way AMD provided me with all the information they had regarding the issues I was facing. But no matter how closely I followed the advice in the AMD documentation, I still ran into issues. I went back to look at what fglrx was doing, and of course I found several things that I believe were documented nowhere, such as never resetting htile preloading if resetting the same surface, or that the first depth clear can't be a fast clear because you need to initialize the htile surface. Maybe I just misread or misunderstood the documentation I was provided, and I apologize if so.
In the end, from a register value point of view, my patch now pretty much exactly matches the register values fglrx uses in each use case. Yet in some specific use cases I am still hitting lockups. So I am left with few options here. Either I am missing a single bit somewhere (despite my automatic command stream comparison I might still be missing something), or the order in which you do things matters much more than we believe, i.e. you need to program some registers in a specific order to avoid issues. I believe the latter is the issue I am left with, but trying to match the fglrx order means a huge overhaul of how r600g builds its command stream.
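As a rough illustration of the two possibilities above, here is a minimal Python sketch. It is not the actual comparison tool. It assumes each command stream has been reduced to a list of (register, value) writes, and the register names are hypothetical placeholders, not real r600 registers. It shows how two streams can agree on every final register value while still differing in programming order.

```python
# Hypothetical sketch: the (register, value) trace format and the register
# names are assumptions made for illustration only.

def final_state(stream):
    """Collapse a list of (register, value) writes into the last value per register."""
    state = {}
    for reg, value in stream:
        state[reg] = value
    return state


def value_mismatches(ours, reference):
    """Registers whose final value differs, or that only one stream writes."""
    a, b = final_state(ours), final_state(reference)
    return {reg: (a.get(reg), b.get(reg))
            for reg in sorted(a.keys() | b.keys())
            if a.get(reg) != b.get(reg)}


def first_write_order(stream):
    """Registers in the order in which they are first written."""
    seen = []
    for reg, _ in stream:
        if reg not in seen:
            seen.append(reg)
    return seen


def order_mismatches(ours, reference):
    """Pairs of registers that both streams write, but in opposite relative order."""
    pos_a = {reg: i for i, reg in enumerate(first_write_order(ours))}
    pos_b = {reg: i for i, reg in enumerate(first_write_order(reference))}
    common = sorted(pos_a.keys() & pos_b.keys())
    return [(x, y)
            for i, x in enumerate(common)
            for y in common[i + 1:]
            if (pos_a[x] < pos_a[y]) != (pos_b[x] < pos_b[y])]


# Same final values, different order of the first two writes.
reference = [("REG_SURFACE_BASE", 0x1000), ("REG_HTILE_ENABLE", 1), ("REG_DEPTH_CONTROL", 0x7)]
ours = [("REG_HTILE_ENABLE", 1), ("REG_SURFACE_BASE", 0x1000), ("REG_DEPTH_CONTROL", 0x7)]

print(value_mismatches(ours, reference))
print(order_mismatches(ours, reference))
```

A value-only comparison reports nothing for these two streams, while the ordering check flags the swapped pair. That is the kind of difference a register-value diff alone cannot see.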
So the fact is, in the end the closed source driver is the reference implementation that has all the information in it. Looking at the closed source driver's command stream is always the safest way to be sure you have all the information. That's at least my opinion.