This was originally part of D46179, but it diverged so much from the unsigned versions that it's better to have it in a separate patch.
Details
Diff Detail
- Repository: rL LLVM
- Build Status: Buildable 21283, Build 21283: arc lint + arc unit
Event Timeline
The title of this patch includes "Constant folding" - shouldn't the code be in ConstantFolding.cpp or some descendant of that?
I remember that there was a suggestion (years ago?) to split target-specific logic like this into its own files because other targets shouldn't be burdened with code that will never affect them, but this hasn't happened yet. There's x86-specific intrinsic code in ConstantFolding.cpp added with rL123206.
I added it to InstCombineCalls.cpp after @craig.topper's suggestion to do so in order to enable adding more optimizations besides constant folding in the same place.
I can't find the emails where the extraction of target-specific code came up, but my vague memory says @rnk had some ideas about how to improve things.
lib/Transforms/InstCombine/InstCombineCalls.cpp

Line | Comment
---|---
265 | We should really remove masking from these intrinsics and use a select. But that can be a follow-up.
286 | Probably could use getIntegerBitWidth instead of getPrimitiveSizeInBits. It should be slightly cheaper, since getPrimitiveSizeInBits has to detect that it's an integer and then call getIntegerBitWidth.
296 | const APInt & Val0
296 | Isn't it possible for the element to be a ConstantExpr? If so, this would fail.
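To make the concern about non-constant elements concrete, here is a minimal sketch of a signed saturating add fold written in plain C++, outside LLVM. It is not the patch's code: `Lane`, `saturatingAddS16`, and `foldSaturatingAdd` are hypothetical names, and an empty `std::optional` lane stands in for an element that is not a plain integer constant (for example a ConstantExpr). The fold clamps each 16-bit sum to the signed range and gives up if any lane is unknown.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Hypothetical stand-in for one vector element: either a known 16-bit
// integer constant, or "not a plain constant" (empty optional).
using Lane = std::optional<int16_t>;

// Signed saturating add on one lane: widen to 32 bits so the sum cannot
// overflow, then clamp to the int16_t range.
int16_t saturatingAddS16(int16_t A, int16_t B) {
  const int32_t Sum = static_cast<int32_t>(A) + static_cast<int32_t>(B);
  const int32_t Max = std::numeric_limits<int16_t>::max();
  const int32_t Min = std::numeric_limits<int16_t>::min();
  if (Sum > Max)
    return static_cast<int16_t>(Max);
  if (Sum < Min)
    return static_cast<int16_t>(Min);
  return static_cast<int16_t>(Sum);
}

// Fold the whole vector only when both operands have the same length and
// every lane is a known constant; otherwise return an empty optional so the
// caller leaves the original operation alone.
std::optional<std::vector<int16_t>>
foldSaturatingAdd(const std::vector<Lane> &A, const std::vector<Lane> &B) {
  if (A.size() != B.size())
    return std::nullopt;
  std::vector<int16_t> Result;
  Result.reserve(A.size());
  for (std::size_t I = 0; I < A.size(); ++I) {
    if (!A[I] || !B[I])
      return std::nullopt; // unknown lane: do not fold
    Result.push_back(saturatingAddS16(*A[I], *B[I]));
  }
  return Result;
}
```

Checking each lane before reading its value, rather than assuming it is an integer constant, is the design choice the last inline comment asks about.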
Implemented suggested changes.
Regarding masking with select - do you mean creating new avx512_padds/avx512_psubs intrinsics without masks and replacing the old calls with new intrinsic+select?
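As a rough illustration of what "unmasked intrinsic + select" would mean, the sketch below models it in plain C++ rather than LLVM IR. All names (`satSubS16`, `unmaskedSatSub`, `selectLanes`, `maskedSatSub`) are hypothetical, and the assumption is that the mask has one bit per lane, with lane I controlled by bit I and a cleared bit taking the pass-through value.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Signed saturating subtract on one 16-bit lane (widen, subtract, clamp).
int16_t satSubS16(int16_t A, int16_t B) {
  const int32_t Diff = static_cast<int32_t>(A) - static_cast<int32_t>(B);
  const int32_t Max = std::numeric_limits<int16_t>::max();
  const int32_t Min = std::numeric_limits<int16_t>::min();
  if (Diff > Max)
    return static_cast<int16_t>(Max);
  if (Diff < Min)
    return static_cast<int16_t>(Min);
  return static_cast<int16_t>(Diff);
}

// The "unmasked" operation: every lane is computed, no mask involved.
// Assumes A and B have the same length.
std::vector<int16_t> unmaskedSatSub(const std::vector<int16_t> &A,
                                    const std::vector<int16_t> &B) {
  std::vector<int16_t> Result(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    Result[I] = satSubS16(A[I], B[I]);
  return Result;
}

// Per-lane select: bit I of Mask set -> OnTrue[I], otherwise OnFalse[I].
// Lanes beyond bit 31 are treated as unselected in this sketch.
std::vector<int16_t> selectLanes(uint32_t Mask,
                                 const std::vector<int16_t> &OnTrue,
                                 const std::vector<int16_t> &OnFalse) {
  std::vector<int16_t> Result(OnTrue.size());
  for (std::size_t I = 0; I < OnTrue.size(); ++I) {
    const bool Selected = I < 32 && ((Mask >> I) & 1u) != 0;
    Result[I] = Selected ? OnTrue[I] : OnFalse[I];
  }
  return Result;
}

// The masked form expressed as the composition of the two pieces above.
std::vector<int16_t> maskedSatSub(const std::vector<int16_t> &A,
                                  const std::vector<int16_t> &B,
                                  const std::vector<int16_t> &PassThru,
                                  uint32_t Mask) {
  return selectLanes(Mask, unmaskedSatSub(A, B), PassThru);
}
```

With this split, the arithmetic and the masking can be reasoned about separately, which is presumably what makes the select form attractive as a follow-up.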