Theoretically improves compile time for targets with many overlapping
registers
Details
Diff Detail
Event Timeline
I don't know what the state of this is nowadays, but doesn't this risk problems with the old problem of XMM and YMM not being differentiated within the register units? Calling convention for windows says XMM registers are preserved but YMM register are clobbered. With this change we would likely register XMM and YMM as clobbered. I haven't thought things through here, is this pass still safe with this in mind?
I don't know anything about that, but shouldn't that be expressed with the existing constraints on the calls/returns anyway?
I don't see why this would warrant special casing, but given that no existing tests failed I assume this is fine
Fix missed test update that shows this fixes a bug where a mov was previously incorrectly deleted
Theoretically improves compile time for targets with many overlapping registers
In practice, this is a major compile time regression instead: http://llvm-compile-time-tracker.com/compare.php?from=6c44a7179f1747ec38d580e6b50bde98555ad811&to=b5041527c75de2f409aa9e2e6deba12b17834c59&stat=instructions
This doesn't make much sense. My preliminary analysis says that this is somehow phys_regs_and_masks fault (which really doesn't buy any real code simplification over iterating the operands)
It does recover most of the regression: http://llvm-compile-time-tracker.com/compare.php?from=ab56719acd98778fb2e48fa425ac7c8d27bdea86&to=d90f7cb559e32c2cbf1f9839d7e8e0cc0be189ba&stat=instructions
There is still a bit of residual regression left though: http://llvm-compile-time-tracker.com/compare.php?from=6c44a7179f1747ec38d580e6b50bde98555ad811&to=d90f7cb559e32c2cbf1f9839d7e8e0cc0be189ba&stat=instructions
I think this is explainable by LiveRegUnits::removeRegsNotPreserved being slower than clearBitsNotInMask. Ideally we would avoid this by switching from using regmasks to regunit masks