- User Since
- Jul 30 2013, 7:58 PM (276 w, 6 h)
Reopening as this was reverted in r346823
Mon, Nov 12
Rebase to remove a commented out line in combineVSZext
Rebase after r346697
Sat, Nov 10
Those asserts are validating the metadata correctness. Deleting them is hiding a bug I think.
Remove the LegalOperations/LegalDAG check completely.
Handle AArch64 expansion with isel patterns instead of custom lowering. This prevents DAG combine from seeing the extract+build_vector opportunity.
Address comments and hopefully arcanist adds context.
Fri, Nov 9
Thu, Nov 8
Does this change the code generated for rgbcmyk here: https://godbolt.org/z/cot3xT ? I filed the original PR based on what happened when trying to vectorize it for sse4.2, which we don't currently do, but I think the two vblendvbs in the avx2 output are similar.
The X86 builtin story is weird. There should be 9 builtins. I'm not sure how you found 12. 8 represent the encodings used by the SSE1/SSE2 cmpps/pd/ss/sd listed below. The 9th is an intrinsic that takes a 5-bit immediate to cover the 32 values that the AVX vcmpps/pd/ss/sd support.
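To make the 8-versus-32 split concrete, here is a minimal C++ sketch. It is not Clang's builtin table; the names `kSSECmpPredicates`, `sseCmpPredicateName`, and `fitsSSEEncoding` are made up for illustration. It only records the eight legacy predicate encodings (immediates 0 through 7) and the fact that a 5-bit immediate addresses 32 predicates.

```cpp
#include <array>
#include <cassert>
#include <string_view>

// Predicates selected by immediates 0-7 of the SSE1/SSE2 compares
// (cmpps/cmppd/cmpss/cmpsd). The array name is hypothetical.
constexpr std::array<std::string_view, 8> kSSECmpPredicates = {
    "eq", "lt", "le", "unord", "neq", "nlt", "nle", "ord"};

// The AVX forms take a 5-bit immediate, so 2^5 = 32 predicates.
constexpr unsigned kAVXCmpImmBits = 5;
constexpr unsigned kAVXCmpPredicateCount = 1u << kAVXCmpImmBits;

// Hypothetical helper: true if the immediate is one of the 8 legacy encodings.
constexpr bool fitsSSEEncoding(unsigned imm) {
  return imm < kSSECmpPredicates.size();
}

// Hypothetical helper: name of the legacy predicate for an immediate in [0, 8).
inline std::string_view sseCmpPredicateName(unsigned imm) {
  assert(fitsSSEEncoding(imm) && "immediate outside the SSE1/SSE2 range");
  return kSSECmpPredicates[imm];
}
```

For example, `sseCmpPredicateName(3)` gives `"unord"`, while immediates 8 through 31 are only reachable through the 5-bit AVX form.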
Wed, Nov 7
What about subtraction?
@steven-johnson do you have an IR test case? There might be some X86 specific change I can make.
Tue, Nov 6
Mon, Nov 5
It looks like "Early Tail Duplication" may have thwarted my change for the PR.
Sat, Nov 3
Fri, Nov 2
Thu, Nov 1
Wed, Oct 31
Retitling to just the x86-64 case. 32-bit mode has issues on arguments too I think and will need more work. The IsIllegalVectorType function is a member of the X86_64ABIInfo so we need to refactor or add a new one for 32-bit.
Tue, Oct 30
Mon, Oct 29
Here's a test case that crashes with -mattr=avx2
Fri, Oct 26
Thu, Oct 25
With the correct patch and context this time
I believe this now has all the patterns we need. I haven't fully collapsed the repetition of patterns into multiclasses, but I'd like to take that as a follow-up.
When can we expect a signed subtraction intrinsic?
Wed, Oct 24
Not sure I understood the AVX512F comment about vcvtusi2sdq not being legal. Isn't it allowed with AVX512F in X86TargetLowering::LowerUINT_TO_FP? It can't be explicitly marked Legal with setOperationAction because that interface only mentions the input type and we still have to check the output type.
Abandoning after an internal discussion
Abandoning after an internal conversation.
I tend to agree with you. I might look into cleaning that up after I fix the extra stuff in KNL
Tue, Oct 23
Add more patterns to handle masked logic ops. Starting to add VPTEST patterns.
Update the fast-isel code in MaterializeInt as well
Rebase and add spill-zero-x86_64.ll to cover the stack spill case that originally prompted me to look at this. I've reduced this from a complex function that has something like 20 spills of 0 in it. This is the best I was able to reduce it. We don't seem to have any directed tests for spilling a 32-bit zero either. It seems to happen in a half dozen or so larger tests.
Mon, Oct 22
Rebase. Fix a bad VPTERNLOG pattern. Add a combine to combineANDNP to fix a regression.
@sanjoy @sammccall I've recommitted this in r344965 with a fix for the miscompile. I believe DAGCombiner::ForwardStoreValueToDirectLoad was forwarding a v4i64 store to a v4i32 load by replacing them with a truncate which doesn't work for vectors. We would need an extract_subvector+bitcast. I've put in a qualification to only forward scalars if the types don't match. Please let me know if you see any more issues.
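A rough illustration of why a truncate is the wrong replacement for vectors follows. This is plain C++ that models the memory behavior, not the DAGCombiner code; `V4i64`, `V4i32`, `reloadLow128`, and `truncateElements` are hypothetical names. Reloading the first 16 bytes of a stored v4i64 as v4i32 sees both halves of the first two 64-bit elements, whereas an element-wise truncate keeps the low 32 bits of all four.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

using V4i64 = std::array<std::uint64_t, 4>;
using V4i32 = std::array<std::uint32_t, 4>;

// Models a v4i32 load from the address of a v4i64 store:
// it reads the first 16 bytes of the stored value.
inline V4i32 reloadLow128(const V4i64 &stored) {
  V4i32 out{};
  std::memcpy(out.data(), stored.data(), sizeof(out));
  return out;
}

// Models a vector truncate: each 64-bit element is narrowed
// to its low 32 bits independently.
inline V4i32 truncateElements(const V4i64 &stored) {
  V4i32 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = static_cast<std::uint32_t>(stored[i]);
  return out;
}
```

With `V4i64 v = {0x1111111100000001, 0x2222222200000002, 3, 4}`, `truncateElements(v)` is `{1, 2, 3, 4}`, but `reloadLow128(v)` contains the 0x11111111 and 0x22222222 halves of the first two elements, so the two results differ. For a scalar i64 store followed by an i32 load on a little-endian target such as x86, the reloaded bytes are exactly the low 32 bits, which is why forwarding scalars with a truncate can still work.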
DAG combine hasn't always performed legalization on the fly after LegalizeDAG. So there used to only be one chance to legalize.
@sammccall I've reverted the change in r344921. Is there anything you can do to help narrow this down? Ideally providing the LLVM IR for the failing case.
Sun, Oct 21
Thread expected width into the constant pool shuffle decoders so we don't over decode the constant.
Sat, Oct 20
Add a hack to prevent the crash in vector-trunc. Though now we miss a combine.
Add a test case for AArch64. I wasn't sure what file to put it in so I made a new file and put the diff of the old vs new code here.
Fri, Oct 19
Rebase on top of D53460