- User Since
- Jul 30 2013, 7:58 PM (285 w, 4 d)
It uses the current rounding mode for inexact conversions. cvtsi642ss should do the same, as do (u)dq2ps and (u)qq2pd. I think we already use sitofp/uitofp for some lengths of those.
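To make the rounding-mode point concrete, here is a minimal C++ sketch. The helper name convert_with_rounding is made up for illustration, and the sketch assumes the hardware int-to-float conversion honors the dynamic rounding mode set through fesetround (as cvtsi2ss does with MXCSR on x86-64). The volatile accesses are only there to keep the compiler from folding or moving the conversion.

```cpp
#include <cassert>
#include <cfenv>
#include <cstdint>

// Hypothetical helper: convert a 64-bit integer to float under the given
// rounding mode (FE_TONEAREST, FE_DOWNWARD, FE_UPWARD or FE_TOWARDZERO).
// Assumption: the target's conversion instruction reads the dynamic
// rounding mode, which is the behavior described in the comment above.
float convert_with_rounding(std::int64_t value, int mode) {
    const int saved = std::fegetround();
    const int rc = std::fesetround(mode);
    assert(rc == 0 && "rounding mode not supported");
    (void)rc;
    // volatile: force a runtime conversion between the two fesetround calls.
    volatile std::int64_t input = value;
    volatile float result = static_cast<float>(input);
    std::fesetround(saved);
    return result;
}
```

A value such as 16777217 (2^24 + 1) is not representable as a float, so it is a convenient input: its two float neighbors are 16777216 and 16777218, and which one comes back depends on the mode passed in. Exactly representable inputs are unaffected.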
Fri, Jan 18
Hopefully fix https://github.com/ClangBuiltLinux/linux/issues/320#issuecomment-455435791. Had to exclude CallBr predecessors in JumpThreading.
Fix most of my own review comments. Fix what was probably a bug in SimplifyCFG.
Thu, Jan 17
Update to include tests
If we return false, as we currently do via the loop, then I don't think it can be tested, since it's really just making explicit what was previously an odd iteration behavior. If we change it to true, then maybe that would be a behavior change we could see.
Add CallBr to isSafeToSpeculativelyExecute to keep isValidAssumeForContext from walking off the end of the basic block.
Wed, Jan 16
Fix LangRef typos
Merge request r40343
There are a lot of test changes here that have nothing to do with the add/sub operations mentioned in the title. How do we know these changes are good? Was any benchmarking done with this patch?
Add assert for SSE4.1
Tue, Jan 15
Add some documentation to the LangRef. Probably needs additional work.
Move patterns into avx2_var_shift. Add multiclass in AVX512 to wrap the 3 vector lengths.
Commandeering so I can rebase the tests in the interest of getting this through review quicker.
Yeah that’s the issue. The software interface for the intrinsics passes base, index, and scale separately. The generic intrinsics expect a GEP they can extract the info from. I suppose with the right casting we could form a GEP, but that seems ugly.
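As a rough illustration of the interface mismatch, here is a scalar C++ model of a gather that takes base, index, and scale as separate values. The function name gather_i32 and the use of std::vector are assumptions for the sketch, not the real intrinsic signature. Each lane loads from the byte address base + index[i] * scale, so the scale is independent of the element type, whereas a GEP would carry that information in its own form.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical scalar model of a gather with separate base/index/scale.
// Lane i loads a 32-bit value from the byte address base + index[i] * scale.
std::vector<std::int32_t> gather_i32(const void *base,
                                     const std::vector<std::int64_t> &index,
                                     int scale) {
    const auto *bytes = static_cast<const unsigned char *>(base);
    std::vector<std::int32_t> out(index.size());
    for (std::size_t i = 0; i < index.size(); ++i) {
        // memcpy sidesteps alignment and aliasing issues when the scale
        // does not match the element size.
        std::memcpy(&out[i], bytes + index[i] * scale, sizeof(std::int32_t));
    }
    return out;
}
```

With a scale of 4 the indices behave like ordinary element indices into an int32 array; with a scale of 1 the same indices are raw byte offsets, which is the case that does not map cleanly onto a typed GEP.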
Mon, Jan 14
I've committed the tests along with some formatting fixes, removal of unused operands, and a fix to minimize the diffs in avx-intrinsics-x86.ll. Can you rebase this patch and regenerate the test checks?
Remove guard from include of xsaveintrin.h. We can't have any check because we need it to always be included on non-Windows platforms due to the target attribute.
On second thought, I'm proposing we just nuke this backend altogether.
Do you need me to commit this?
Address some review comments. Include context this time
Sat, Jan 12
Fix accidental renaming of the tests that prevented the script from working
Add back the check lines. Make a copy of the test for the old intrinsics.
Oops that was a mistake. I meant to regenerate after using the autoupgrade code to get the new IR out of opt.
Fri, Jan 11
Attempt to fix a bug reported by @nickdesaulniers. We weren't using the label created for the blockaddress of the basic block when we printed the inline assembly block, so there was no guarantee the label we were using wouldn't be removed. To fix this I've kept the blockaddress object in the SelectionDAG instead of stripping it in the builder. I've created it directly as TargetBlockAddress to hide it from isel. Then in the printer I look for the operand being a blockaddress and grab the symbol for it. I had to add a few checks for TargetBlockAddress in other places to make this work. I admit I have no idea if this is the right thing to do at all, and I'm open to any input on this.
Thu, Jan 10
Rebased on top of CallBase. This builds and passes lit tests.
How do you get your target to be able to type legalize an abs of any type? The interface for enabling custom legalization requires calling setOperationAction with a specific type, does it not?
Wed, Jan 9
Abandoning because I sent it to llvm-commits instead of cfe-commits. Will redo to get the right mailing list.
Reuse combineVSelectToShrunkBlend. This reduces the code and should allow better optimization when the select condition is used by multiple selects.
PMULDQ/PMULUDQ is interacting poorly with the fact that we convert zext/sext to zext_vector_inreg/sext_vector_inreg before type legalization. So we split the PMULDQ/PMULUDQ when we create them. Then SimplifyDemandedBits can't optimize the zext/sext to aext because the splitting messed up the use count. Then the zext/sext becomes a split zext_invec/sext_invec, but SimplifyDemandedBits won't turn those into aext_invec. So it's really a gross ordering problem that probably goes away with -x86-experimental-vector-widening-legalization since we won't eagerly create zext_invec/sext_invec ops.
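The reason a zext/sext feeding these multiplies could become an aext is that each 64-bit lane only reads the low 32 bits of its inputs. A scalar C++ model of one lane shows this. The function names pmuldq_lane and pmuludq_lane are made up for the sketch, and it models only the instruction semantics, not anything in the DAG combiner.

```cpp
#include <cstdint>

// Hypothetical model of one 64-bit lane of PMULDQ: sign-extend the low
// 32 bits of each operand and form the full 64-bit product. The upper
// 32 bits of a and b are never read. (The uint32 -> int32 cast assumes
// two's complement wraparound, guaranteed since C++20.)
std::int64_t pmuldq_lane(std::uint64_t a, std::uint64_t b) {
    const auto lo_a = static_cast<std::int32_t>(static_cast<std::uint32_t>(a));
    const auto lo_b = static_cast<std::int32_t>(static_cast<std::uint32_t>(b));
    return static_cast<std::int64_t>(lo_a) * static_cast<std::int64_t>(lo_b);
}

// Hypothetical model of one 64-bit lane of PMULUDQ: zero-extend the low
// 32 bits of each operand and form the full 64-bit product.
std::uint64_t pmuludq_lane(std::uint64_t a, std::uint64_t b) {
    return (a & 0xFFFFFFFFull) * (b & 0xFFFFFFFFull);
}
```

Because the result is the same whatever sits in the upper halves of the inputs, those bits are not demanded, which is the fact SimplifyDemandedBits needs to see before the use counts get disturbed by splitting.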
There is this patch that was trying to make ABS more available: D50239
Do you do custom type legalization for ABS in your target? There's no support for legalizing it in the target-independent type legalizer.
Tue, Jan 8
Mon, Jan 7
Doesn't SimplifyDemandedBits already call ShrinkDemandedConstant for AND/OR/XOR? And without an override of targetShrinkDemandedConstant, ShrinkDemandedConstant only works on those 3 nodes. So is an explicit call to ShrinkDemandedConstant even needed here?
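For context, the idea behind shrinking a constant can be modeled on plain integers. This is a simplified sketch and not LLVM's implementation: the names BitOp, apply, and shrink_constant are hypothetical, and it ignores any special cases the real code handles. For AND/OR/XOR, constant bits outside the demanded mask cannot affect the demanded result bits, so they can be cleared.

```cpp
#include <cstdint>

// Hypothetical model: the three bitwise ops discussed above.
enum class BitOp { And, Or, Xor };

// Apply op to a value x and a constant c.
std::uint32_t apply(BitOp op, std::uint32_t x, std::uint32_t c) {
    switch (op) {
    case BitOp::And: return x & c;
    case BitOp::Or:  return x | c;
    case BitOp::Xor: return x ^ c;
    }
    return 0; // unreachable for valid BitOp values
}

// Clear the constant bits that no user demands.
std::uint32_t shrink_constant(std::uint32_t c, std::uint32_t demanded) {
    return c & demanded;
}

// Check that, on the demanded bits, the op gives the same result with the
// original constant and with the shrunk one.
bool same_on_demanded(BitOp op, std::uint32_t x, std::uint32_t c,
                      std::uint32_t demanded) {
    const std::uint32_t before = apply(op, x, c) & demanded;
    const std::uint32_t after = apply(op, x, shrink_constant(c, demanded)) & demanded;
    return before == after;
}
```

The equivalence holds bit by bit for these three operations, which is why a generic shrink is safe for them without a target hook.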
Is the zext/cmp optimization in WebAssemblyFastISel::zeroExtendToI32 really valid? Do other targets do that?
Gentle ping. Any chance this is going to get fixed before 8.0 branches?
Rebase after fixing immediates to be in range without a modulo
Sun, Jan 6
Add select checks. Move masking for variable vpshld/vpshrd to the intrinsic header. So now we don't need different builtins for mask and maskz.
Thu, Jan 3
@rupprecht, does it look related to the memptr.isvirtual bit that C++ member function pointers use?
Address review comments. Add test case.
Wed, Jan 2
It's producing more test changes now on X86 and other targets. It's producing several regressions in the combine-sdiv.ll test by failing to delete some AND instructions. The other X86 changes look pretty neutral. I haven't looked at the other targets yet.