- User Since
- May 11 2015, 7:59 AM (170 w, 6 d)
Thu, Aug 16
Thanks for putting the time into this, just one nit before its committed please.
Fixed and recommitted in r339858.
Re-adjusted ShAmt for big endian targets.
Thanks for reverting. The issue was that I was assuming that the instruction operands mapped to arguments for CallInsts. Will be recommitting.
Wed, Aug 15
Changed test regexesess
Tue, Aug 14
Removed the unnecessary isa<Instruction> checks and updated the test to actually test.
Rebased and fixed the handling of undef values. I've also moved the tests around so we have a standalone file for the call tests. Also added a small piece of control to decide whether we bother to promote or not: if we find nothing but sources, sinks and the icmp, then we don't bother doing anything.
Rebased and updated changes to the x86 codegen tests.
Thanks for fixing this, LGTM.
Fri, Aug 10
- Removed commented out code
- Added some TODOs
- Expanded the description of what a source and sink are
Thu, Aug 9
Moved arguments to occupy a single line.
Wed, Aug 8
Thanks for the extra tests, LGTM.
To try to make it clear that these are not user facing intrinsics, I've renamed them to arm.codegen.zeroext
Ok, from your reply on the other ticket - LGTM.
Cheers, shufflevector always confuses me. LGTM.
Thu, Aug 2
So this started life in the DAGCombiner and issues around the implementation were raised and that it would be useful to have earlier in the pipeline. But it seems that it hasn't really be thought, or discussion, about how this would fit well in the existing passes... I think DAG combine has always been the right place for this because we're trying to reuse values - something that DAGs are good for. In DAGCombiner::visitANDLike, we already handle ANDs with SRL operands and the motivating example can be addressed with very little effort:
Wed, Aug 1
And what would I need to do about testing for a generic intrinsic? Add to BitCode/compatibility-6.0.ll test and just keep this codegen one too?
Yes exactly, I would like to use these for loops. In ARMCodeGenPrepare I need to insert truncs to keep the IR legal, although I've already proved that the value is already zero extended. In those cases I want to use these intrinsics to carry that knowledge through. So it's not trying to work around the lack of info of target specific nodes. Another idea I wondered about was adding flags to instructions, but that seems far more intrusive.
Tue, Jul 31
Wed, Jul 25
This LGTM. I don't think there's a problem with solving this niche in the backend.
Tue, Jul 24
Is it such a bad idea? Sure, I would like to check whether the sel intrinsic has been used or not, but what happens in the case of inline assembly? The AAPCS is also vague, I'm not sure what a 'public interface' is in terms of an LLVM module. I'd like to have an option which is the user can be explicit in saying its fine to use these instructions.
Mon, Jul 23
Added tests for:
- non load operand to the add,
- immediate operand to the add,
- volatile store
- non-consecutive loads
Performed a rebase and added a test from a manually unrolled example. I've also added an option to control the use of the GE writing flags - really I think this should go as a subtarget feature so this can be used across this pass and ARMCodeGenPrepare.
Added another test
Now disabled at -O0
Jul 20 2018
I did some work for Thumb-2 last year in a similar vain during the combine phase, in PerformSHLSimplify, but this (unsurprisingly) doesn't handle lshr. I didn't find any headaches from changing the canonical form in those cases, so probably would be worth having it there.
All of your test cases are rooted at an or, so it makes sense to search up from there. Why not start with searching just from or (and xors?) and then add the search from more operators in later patches?
Jul 19 2018
Jul 18 2018
Ok, thanks for the clarification. I'll have a look in instcombine.
Fixed support for handling switch instructions and added another test.
This looks like an odd solution to me, I haven't seen TokenFactors used like that before. Isn't it okay for the AND to be folded? Why not just check that the AND hasn't be folded into a constant before trying to update its, now non-existent, operands?
Fixed a couple of bugs around the truncating of values into the root users. Also added explicit checks to reject signed compares.
Jul 17 2018
Now explicitly stating what values and opcodes we support. Also added some debug code around and removed the ShouldIgnore function.
Moved a couple of functions into lambdas and added a couple of checks to reject ConstantExprs.
- Fixed the really bad typo that was allowing wrapping instructions, quite confused that this wasn't caught by my existing tests... so added more of them.
- Added more type checking so we can catch the i64 issue
- Added a couple more tests to show the current limitation too
I'm happy to move it somewhere general, but where would be suitable, InstCombine?
Jul 16 2018
Jul 12 2018
I appreciate the tests here are a little lacking, I will add something for:
- volatile stores,
- non-consecutive stores,
- adding a constant,
- decrementing indvar,
- a test from manually unrolled piece of source.
Many thanks for those!
Jul 11 2018
Merging changes into D49020 since it was easier to do with a rebase.
Combined changes from D48972 and performed a rebase.
Added comments to describe the purpose of some of the tests.
Jul 10 2018
I still think you're overcomplicating things here... Keeping the ReadOnly flag, you can then just check what's writing to memory in the rest of the block:
Jul 6 2018
Added loop to destroy the MACCandidates.
Renamed BinOpSequence to BinOpChain and replaced MACCandidates with Candidates in the Alias functions.
Okay, looks fine and like ISB :) Thanks for also adding a bit of 8.3 in there too!
Cheers John! Moved, removed or localised variables and implemented getAnalysisUsage.
Jul 5 2018
LGTM. For future reference, and before committing, arm and aarch64 tests live in different codegen directories so please separate the test into the two sub-directories. Thanks!
Jul 4 2018
Jul 3 2018
- fixed typos,
- changed the names and default setting of the options, the generation of uadd and usub are now disabled by default.
- moved icmp handling code out of isSigned and into isPromotedResultSafe.
- added another target to test the default command line options.