Due to naming, it appears we don't actually try to select these instructions. In general, selection for dot instructions (specifically the dot4 variants) is a bit fragile, and I plan to rework lowering via combining to make selection more robust. For now, we should at least try to select them.
Repository: rG LLVM Github Monorepo
Event Timeline
We used to pattern match all the dot operations, but stopped because of a ridiculous blow-up in compile time. Have you tried measuring that?
Also look at the generated selection tables. This shouldn't be one of the first patterns tried.
I'll take a look.
I have not directly measured the compile time of this patch, but I will. I was thinking of deleting these patterns, as they are supported by DAG combining, which shouldn't be too expensive given early exits on missed V_MUL_*24 operands. It is worth mentioning that there is also a huge compile-time cost for not selecting these instructions when we should: we are seeing kernels getting stuck in RA for hours due to code bloat.
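To make the "early exit" point concrete, here is a minimal sketch in Python of how a combine could walk an add chain and bail out as soon as an operand is not a 24-bit multiply. This is a toy model, not LLVM code: `Node`, the `"mul24"` op label, and `match_dot4_chain` are hypothetical names invented for illustration, and a real combine would additionally have to check that the multiply operands are byte extracts of the same two sources.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Toy DAG node (hypothetical; not an LLVM SDNode)."""
    op: str                            # "add", "mul24", or "leaf"
    operands: Tuple["Node", ...] = ()
    name: str = ""


def match_dot4_chain(root: Node) -> Optional[Tuple[List[Node], Optional[Node]]]:
    """Collect four mul24 terms from a chain of adds.

    Returns (products, accumulator), where accumulator is None when the
    chain is exactly four products. Returns None on the first operand
    that breaks the shape, which is the cheap early exit.
    """
    products: List[Node] = []
    node = root
    while len(products) < 4:
        # The innermost term may itself be the fourth product.
        if node.op == "mul24" and len(products) == 3:
            products.append(node)
            return products, None
        if node.op != "add":
            return None                # early exit: chain is too short
        lhs, rhs = node.operands
        if rhs.op == "mul24":
            products.append(rhs)
            node = lhs
        elif lhs.op == "mul24":
            products.append(lhs)
            node = rhs
        else:
            return None                # early exit: no mul24 operand here
    return products, node              # whatever remains is the accumulator
```

In this model a non-matching add is rejected after inspecting at most two operands per level, so the cost on code that does not contain a dot product stays small.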
Will abandon if https://reviews.llvm.org/D155995 supersedes selection of these instructions.
This is better than the combiner, if it doesn't completely blow up compile time. It's probably easier to avoid the compile-time problems with the combine, though.
The main problem is the number of permutations that can occur in the extract code. I haven't tried implementing this in TableGen and measuring, but I would assume that trying to pattern match them all with TableGen would blow up compile time.
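As a rough illustration of the permutation concern, the Python sketch below shows three equivalent ways a single unsigned byte extract can be written, and counts how many combinations arise when each of the eight extracts in a dot4 can independently use any of them. The semantics assumed for the unsigned dot4 (four byte products plus a 32-bit accumulator, wrapped to 32 bits) and the choice of exactly three spellings are assumptions made for this example; all function names are hypothetical, and the count is not a measurement of actual TableGen output.

```python
MASK32 = 0xFFFFFFFF


def extract_shift_then_mask(x: int, i: int) -> int:
    """Byte i of a 32-bit value: shift right, then mask."""
    return (x >> (8 * i)) & 0xFF


def extract_mask_then_shift(x: int, i: int) -> int:
    """Byte i of a 32-bit value: mask in place, then shift right."""
    return (x & (0xFF << (8 * i))) >> (8 * i)


def extract_shl_then_shr(x: int, i: int) -> int:
    """Byte i of a 32-bit value: shift to the top byte, then shift down."""
    return ((x << (24 - 8 * i)) & MASK32) >> 24


EXTRACT_SPELLINGS = (
    extract_shift_then_mask,
    extract_mask_then_shift,
    extract_shl_then_shr,
)


def dot4_u32_u8(a: int, b: int, acc: int, extract=extract_shift_then_mask) -> int:
    """Assumed reference semantics: sum of four u8 products plus acc, mod 2**32."""
    total = acc
    for i in range(4):
        total += extract(a, i) * extract(b, i)
    return total & MASK32


def count_extract_combinations(spellings: int = len(EXTRACT_SPELLINGS),
                               num_extracts: int = 8) -> int:
    """Each of the eight extracts (four per source) picks a spelling independently."""
    return spellings ** num_extracts
```

With three spellings per extract this already gives 3**8 = 6561 combinations, before accounting for operand order in the multiplies or the order of the adds, which is why matching every form with explicit patterns scales badly.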