Trees often involve extracts of i8 / i16 values, and the ultimate sources may come from extracts of i32 values. Capturing trees that involve e.g. SIGN_EXTEND has limited impact on combining into perms, but significantly helps when combining trees for arithmetic ops (e.g. v_dot4).
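To make the arithmetic case concrete, here is a minimal standalone sketch (not code from this patch) of a sign-extended byte dot product over two packed i32 sources, which is the shape of tree the summary says benefits from tracking SIGN_EXTEND. The helper names `sextByte`, `zextByte`, `dot4Signed`, and `dot4Unsigned` are hypothetical, and the functions only model the arithmetic; they make no claim about how v_dot4 is actually selected.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: take byte Idx (0 = least significant) of a packed
// 32-bit value and sign-extend it, i.e. sext(extract i8).
inline int32_t sextByte(uint32_t Packed, unsigned Idx) {
  int32_t B = static_cast<int32_t>((Packed >> (8 * Idx)) & 0xFFu);
  return B >= 128 ? B - 256 : B;
}

// Hypothetical helper: same extract, but zero-extended.
inline uint32_t zextByte(uint32_t Packed, unsigned Idx) {
  return (Packed >> (8 * Idx)) & 0xFFu;
}

// Four sext(extract i8) * sext(extract i8) products added to an accumulator.
inline int32_t dot4Signed(uint32_t A, uint32_t B, int32_t Acc) {
  for (unsigned I = 0; I < 4; ++I)
    Acc += sextByte(A, I) * sextByte(B, I);
  return Acc;
}

// The same tree built from zero-extended bytes.
inline uint32_t dot4Unsigned(uint32_t A, uint32_t B, uint32_t Acc) {
  for (unsigned I = 0; I < 4; ++I)
    Acc += zextByte(A, I) * zextByte(B, I);
  return Acc;
}

// Usage: the top byte of A is 0xFF, so the two extension kinds disagree.
inline void checkExample() {
  const uint32_t A = 0xFF020304u; // bytes 0..3: 4, 3, 2, 0xFF
  const uint32_t B = 0x01010101u; // bytes 0..3: 1, 1, 1, 1
  assert(dot4Signed(A, B, 0) == 8);      // 4 + 3 + 2 + (-1)
  assert(dot4Unsigned(A, B, 0) == 264u); // 4 + 3 + 2 + 255
}
```

Because the signed and unsigned results differ whenever a byte has its top bit set, a combine that collapses such a tree into a single dot product has to know which extension each leaf went through.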
Details
Diff Detail
- Repository: rG LLVM Github Monorepo
Unit Tests
Event Timeline
Address comments and remove redundant ValueSize checks and handling.
| Line | llvm/lib/Target/AMDGPU/SIISelLowering.cpp |
|---|---|
| 10440 | No problem -- I would think it is correct, as we only ever take the non-extension bits. Either way, I'll look at it separately. |
| Lines | llvm/lib/Target/AMDGPU/SIISelLowering.cpp |
|---|---|
| 10919–10922 | This could probably just return the getBitcast. |
| 10929–10931 | Ditto. |
Handling any_extend here seems dangerous; can you do that in a separate patch? It might be correct, but I'd rather keep it separate.
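For readers following the any_extend discussion, the sketch below models the general property at stake: an any-extend leaves the extension bits unspecified, so only bytes below the source width can be traced back to the source. This is a standalone illustration rather than code from the patch; `anyExtendI8` and `byteOf` are hypothetical names, and the `Junk` parameter stands in for whatever the upper bits happen to contain.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an any-extend from i8 to i32: the low 8 bits come from
// Src, the upper 24 bits are unspecified and are taken from Junk here.
inline uint32_t anyExtendI8(uint8_t Src, uint32_t Junk) {
  return (Junk & 0xFFFFFF00u) | Src;
}

// Hypothetical helper: read byte Idx (0 = least significant) of a 32-bit value.
inline uint8_t byteOf(uint32_t V, unsigned Idx) {
  return static_cast<uint8_t>((V >> (8 * Idx)) & 0xFFu);
}

inline void checkAnyExtend() {
  const uint32_t X = anyExtendI8(0xAB, 0x00000000u);
  const uint32_t Y = anyExtendI8(0xAB, 0xFFFFFFFFu);
  // Byte 0 is a source bit range: identical whatever the upper bits hold.
  assert(byteOf(X, 0) == 0xAB && byteOf(Y, 0) == 0xAB);
  // Byte 1 lies in the extension bits: its value depends on the junk.
  assert(byteOf(X, 1) != byteOf(Y, 1));
}
```

Under this model, looking through an any-extend is sound only when every byte requested lies within the source width, which is the condition the two comments above are weighing.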