The irony of this patch is that the one CPU that is affected is AMD Jaguar, and Jaguar has a completely double-pumped AVX implementation. But getting the cost model to reflect that is a much bigger problem. The small goal here is simply to improve on the lie that !AVX2 == SandyBridge.
This seems reasonable to me also. It might be nice to have a separate X86Subtarget property for double-pumped 32-byte loads/stores, but that may be overkill for this one use.
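To make the idea of a separate property concrete, here is a minimal standalone sketch. It is not the patch and it does not use real LLVM APIs: `SubtargetSketch`, `DoublePumps32ByteMemOps`, and `memOpCost` are hypothetical names, and the cost is just a count of memory operations issued, not a measured latency or throughput.

```cpp
// Standalone illustration only; none of these names exist in LLVM.
struct SubtargetSketch {
  bool HasAVX = false;
  bool HasAVX2 = false;
  // Hypothetical property: a 32-byte load/store is issued as two 16-byte halves.
  bool DoublePumps32ByteMemOps = false;
};

// Relative cost of one vector load or store of `Bytes` bytes, expressed as
// the number of native-width memory operations it is assumed to need.
inline unsigned memOpCost(const SubtargetSketch &ST, unsigned Bytes) {
  // Assumption: without AVX, or with double-pumped AVX, the native width is 16 bytes.
  const unsigned NativeWidth =
      (ST.HasAVX && !ST.DoublePumps32ByteMemOps) ? 32u : 16u;
  // Round up to whole native-width operations.
  return (Bytes + NativeWidth - 1) / NativeWidth;
}
```

With this shape, the cost depends on the explicit property rather than on the absence of AVX2, so an AVX-only target with full-width memory ops and a Jaguar-like target with split ops can be told apart.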
Thanks, Zia and Dave. I agree a separate property would be good if we really want to model this better. For reference, I noticed this bug as part of the TTI cost model discussion in PR26837:
https://llvm.org/bugs/show_bug.cgi?id=26837
There's a lot of bigger stuff that could be modeled better. :)
Ideally, I think we would lift latency/throughput data from the SchedMachineModel, but I'm not sure how to do that yet.