- User Since: May 5 2014, 7:26 AM (259 w, 2 d)
Mon, Apr 22
rebase - still showing a number of regressions that are proving tricky to fix
Abandoning - the x86 improvements were handled by rL358019
Add AMDGPU srl(and(x,m),c) -> and(srl(x,c),srl(m,c)) canonicalization to improve BFE recognition
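As a rough illustration of why this rewrite is value-preserving: a logical shift right distributes over a bitwise AND, so both forms yield the same bits. The sketch below models this on plain 32-bit integers in standalone C++. It is not the DAG combine itself; the function names and `formsAgree` are hypothetical helpers introduced only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Original form: mask first, then logical shift right.
constexpr uint32_t srlOfAnd(uint32_t x, uint32_t m, unsigned c) {
  return (x & m) >> c;
}

// Rewritten form: shift the value and the mask, then AND.
// With a constant mask, srl(m,c) folds to a constant, leaving
// and(srl(x,c), m') which has the shape of a bitfield extract.
constexpr uint32_t andOfSrl(uint32_t x, uint32_t m, unsigned c) {
  return (x >> c) & (m >> c);
}

// Hypothetical helper: checks that both forms agree for every
// in-range shift amount (0..31) on the given value and mask.
inline bool formsAgree(uint32_t x, uint32_t m) {
  for (unsigned c = 0; c < 32; ++c)
    if (srlOfAnd(x, m, c) != andOfSrl(x, m, c))
      return false;
  return true;
}
```

For example, with x = 0xABCD1234, m = 0x0000FF00 and c = 8, both forms extract the byte 0x12.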
Fri, Apr 19
Thu, Apr 18
Please add context to the diff
Your test cases need to be a lot simpler - I'd recommend looking at buildvec-insertvec.ll and possibly adding your tests to that file instead of adding this new file.
Wed, Apr 17
Please add support for arm/aarch64 splat-and-multiply instructions
Some initial thoughts - I don't know a lot about the bfloat16 instructions, so I need to read up when I get the chance.
One last style comment from me, but we need somebody more familiar with the different ABIs to give final approval.
A couple of minor comments, but this looks almost ready to me - the avx512 broadcast folds are a known issue
Tue, Apr 16
Use Type::getVectorElementType()->isIntegerTy(1) - reduction types should always be vectors
Add codegen comments and use Type::isIntegerTy(1)
Mon, Apr 15
LGTM - there are too many different optimizations and canonicalizations that can occur on such a pattern to be able to match all of the permutations.
Sat, Apr 13
Fri, Apr 12
LGTM - let's leave the std::move issue for now
Removed if-else chain
Thu, Apr 11
rebase after rL358186 et al