MachineLICM can hoist an invariant load, but if that load is folded into another instruction, the instruction needs to be unfolded first. On AVX512 this load is sometimes a broadcast load, which we were previously unable to unfold. This patch adds initial support for that, with a very basic list of supported instructions as a starting point.
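To make the idea of unfolding a broadcast load concrete, here is a minimal sketch of a lookup table that maps a folded instruction to its register form plus a standalone broadcast load. It is an illustration only: the struct, the table, the function `lookupBroadcastUnfold`, and the opcode names are all hypothetical and are not taken from the patch or from LLVM.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, simplified model of a broadcast unfold table.
// None of these names come from LLVM; they only illustrate the mapping.
struct BroadcastUnfoldEntry {
  const char *FoldedOp;    // op with the broadcast load folded into it
  const char *RegOp;       // register-register form to use after unfolding
  const char *BroadcastOp; // standalone broadcast load that can be hoisted
  unsigned ElementBits;    // width of the broadcast scalar element
};

// A deliberately tiny list, in the spirit of "a very basic list of
// supported instructions as a starting point".
static const BroadcastUnfoldEntry BroadcastUnfoldTable[] = {
    {"ADD32_rmb", "ADD32_rr", "BROADCAST32_rm", 32},
    {"ADD64_rmb", "ADD64_rr", "BROADCAST64_rm", 64},
};

// Returns the table entry for a folded opcode, or nullptr if the
// instruction is not in the list and therefore cannot be unfolded.
const BroadcastUnfoldEntry *lookupBroadcastUnfold(const char *FoldedOp) {
  for (const BroadcastUnfoldEntry &E : BroadcastUnfoldTable)
    if (std::strcmp(E.FoldedOp, FoldedOp) == 0)
      return &E;
  return nullptr;
}
```

In this sketch, a hit on the table tells the caller which two instructions replace the folded one; a miss means the folded instruction is left alone and its load cannot be hoisted.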
- Repository
- rG LLVM Github Monorepo
Event Timeline
Does this break the X86FoldTablesEmitter in any way?
In the future, do you think we could add something similar to support scalar ops and ops where memory size != register size?
llvm/lib/Target/X86/X86InstrFoldTables.cpp | ||
---|---|---|
5248 | Do masked ops work? Possibly add a few to this initial test? | |
llvm/lib/Target/X86/X86InstrFoldTables.h | ||
44–46 | I'm happy for the bit adjustments to go in straight away (without the new broadcast bits - these need to stay in this patch). |
I don't think it will break the X86FoldTablesEmitter. The emitter just won't generate the new table.
Yes, I think we could support scalar ops in the future.
llvm/lib/Target/X86/X86InstrFoldTables.cpp | ||
---|---|---|
5248 | I think that if the mask were dynamically all zeros, any memory fault would be suppressed. So if we unfold a masked operation, we would also need to generate a masked broadcast to maintain the fault suppression, and that broadcast wouldn't be eligible for hoisting. We don't fold masked operations in the first place, so unfolding without applying the mask would be fine today, but I wouldn't want it to break in the future. |
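The fault-suppression concern in the comment above can be illustrated with a toy model. This is a sketch under stated assumptions, not real hardware or LLVM behavior: a null pointer stands in for an unmapped address, a thrown exception stands in for a memory fault, and the names `loadOrFault`, `foldedMaskedAdd`, and `unfoldedUnmaskedBroadcastAdd` are invented for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <stdexcept>

using Vec4 = std::array<int32_t, 4>;

// Toy memory access: nullptr models an unmapped address, and the
// exception models a memory fault.
int32_t loadOrFault(const int32_t *Ptr) {
  if (Ptr == nullptr)
    throw std::runtime_error("memory fault");
  return *Ptr;
}

// Models a masked add with a folded broadcast memory operand.
// When no mask bit is set, memory is never touched, so no fault occurs.
Vec4 foldedMaskedAdd(const Vec4 &PassThru, const Vec4 &A,
                     const int32_t *Ptr, uint8_t Mask) {
  Vec4 Result = PassThru;
  if ((Mask & 0xF) == 0)
    return Result; // fault suppression: the load is skipped entirely
  const int32_t B = loadOrFault(Ptr);
  for (unsigned I = 0; I < 4; ++I)
    if ((Mask >> I) & 1)
      Result[I] = A[I] + B;
  return Result;
}

// Models the unfolded form with an UNMASKED broadcast load followed by
// a masked register add. The load always executes, even for Mask == 0.
Vec4 unfoldedUnmaskedBroadcastAdd(const Vec4 &PassThru, const Vec4 &A,
                                  const int32_t *Ptr, uint8_t Mask) {
  const int32_t B = loadOrFault(Ptr); // may fault regardless of the mask
  Vec4 Result = PassThru;
  for (unsigned I = 0; I < 4; ++I)
    if ((Mask >> I) & 1)
      Result[I] = A[I] + B;
  return Result;
}
```

With a valid pointer the two functions agree for every mask. With an all-zero mask and an invalid pointer, only the unfolded version faults, which is the behavioral difference the comment wants to avoid introducing.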
This was done to get a free bit for the TB_FOLDED_BCAST flag.
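As a rough illustration of why bit adjustments can free room for a new flag, the sketch below packs several fields into a single byte and reserves the remaining bit for a broadcast flag. The layout, the field widths, and every name in the `sketch` namespace are assumptions made up for this example; they do not reflect the actual values in X86InstrFoldTables.h.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {
// Hypothetical 8-bit layout, invented for illustration only:
//   bits 0-1: operand index (0..3)
//   bits 2-4: log2 of the required alignment (0..7)
//   bit  5  : folded load
//   bit  6  : folded store
//   bit  7  : folded broadcast (the bit left free by keeping other fields tight)
const unsigned IndexShift = 0;
const unsigned IndexMask = 0x3;
const unsigned AlignShift = 2;
const unsigned AlignMask = 0x7;
const uint8_t FoldedLoad = 1u << 5;
const uint8_t FoldedStore = 1u << 6;
const uint8_t FoldedBroadcast = 1u << 7;

// Packs the fields into one byte; out-of-range values are truncated.
inline uint8_t encode(unsigned Index, unsigned AlignLog2, bool Load,
                      bool Store, bool Broadcast) {
  uint8_t Flags = 0;
  Flags |= static_cast<uint8_t>((Index & IndexMask) << IndexShift);
  Flags |= static_cast<uint8_t>((AlignLog2 & AlignMask) << AlignShift);
  if (Load)
    Flags |= FoldedLoad;
  if (Store)
    Flags |= FoldedStore;
  if (Broadcast)
    Flags |= FoldedBroadcast;
  return Flags;
}

inline unsigned index(uint8_t Flags) { return (Flags >> IndexShift) & IndexMask; }
inline unsigned alignLog2(uint8_t Flags) { return (Flags >> AlignShift) & AlignMask; }
inline bool isBroadcast(uint8_t Flags) { return (Flags & FoldedBroadcast) != 0; }
} // namespace sketch
```

The point of the sketch is only that each multi-bit field consumes a fixed slice of the flags word, so narrowing or shifting one field is what makes a single-bit flag available.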