User Details
- User Since
- Apr 16 2015, 3:05 AM (376 w, 6 d)
Fri, Jul 1
Could you please give a brief outline of the approach?
Wed, Jun 22
For other instructions, we seem to just duplicate the handling in findBaseDefiningValueOfVector and findBaseDefiningValue. Doing the same for freeze would be more consistent with the existing code.
Wed, Jun 15
The optimizer can insert new atomic memcpy/memmove calls; e.g., loop-idiom-recognize does that. Currently, when loop-idiom-recognize inserts such a call, it doesn't tag it with the element type, which makes it an incorrect transform for us.
May 25 2022
Copying arrays of pointers in environments with a moving garbage collector is different from copying arrays of 'primitive' values of the same type.
Well, I understand that upstream may not take it as a 'justification'.
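To illustrate the distinction, here is a minimal C++ sketch of an element-wise copy. It is only an illustration, not the LLVM lowering: the helper name copySlotsAtomically is hypothetical, pointer slots are modeled as std::atomic<std::uintptr_t>, and memory_order_relaxed stands in for LLVM's unordered atomics, which C++ does not expose. The point is that every slot is read and written as a whole, so a concurrent observer never sees a torn pointer.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not an LLVM API: copies `count` pointer-sized slots
// one element at a time. Each slot is loaded and stored as a single atomic
// access, so no reader can observe a partially written slot.
// Assumption: relaxed ordering is used as the closest C++ stand-in for
// LLVM's "unordered" atomic ordering.
static void copySlotsAtomically(std::atomic<std::uintptr_t> *dst,
                                const std::atomic<std::uintptr_t> *src,
                                std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    const std::uintptr_t value = src[i].load(std::memory_order_relaxed);
    dst[i].store(value, std::memory_order_relaxed);
  }
}
```

A plain byte-wise copy gives no such per-slot guarantee, which is why the element size matters for this kind of copy.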
Apr 26 2022
Apr 19 2022
Apr 5 2022
Mar 31 2022
Mar 28 2022
Jan 24 2022
Jan 18 2022
Jan 12 2022
Dec 14 2021
Nov 24 2021
Do you mind if I put this change under an on-by-default cl::opt flag? This way we can temporarily turn it off downstream.
Nov 23 2021
This change broke the ability to discard checks in some cases. Consider this example:
define i1 @test1(i32 %a, i32* %b_ptr) {
  %b = load i32, i32* %b_ptr, align 8, !range !{i32 0, i32 2147483646}
  %c1 = icmp slt i32 %a, %b
  br i1 %c1, label %exit, label %cont.1
Nov 9 2021
I agree with Philip: the widening transformation doesn't rely on deoptimization.
Oct 19 2021
Oct 14 2021
Add a simple test case using -inline-cost-full cl::opt.
Oct 12 2021
Oct 5 2021
I also notice that the new memcmp intrinsic returns i1, which is different from libc memcmp. libc memcmp returns an int indicating which of the sides was greater. It looks like the new intrinsic matches bcmp semantics.
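The difference can be sketched in C++ as follows. The helper buffersDiffer is hypothetical and only models an equality-only, bcmp-style result; whether the intrinsic's i1 encodes "equal" or "different" is not stated here, so the polarity below is an assumption.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper, not the intrinsic itself: an equality-only comparison
// in the spirit of bcmp. It reports whether the buffers differ and carries
// no ordering information.
// Assumption: `true` means "different"; the actual i1 polarity may be the
// opposite.
static bool buffersDiffer(const void *lhs, const void *rhs, std::size_t size) {
  return std::memcmp(lhs, rhs, size) != 0;
}

// memcmp-style result reduced to its sign: -1, 0, or 1. The sign tells which
// side compares greater, which a single bit cannot express.
static int orderingSign(const void *lhs, const void *rhs, std::size_t size) {
  const int result = std::memcmp(lhs, rhs, size);
  return (result > 0) - (result < 0);
}
```

For "abc" versus "abd", orderingSign distinguishes the two argument orders, while buffersDiffer gives the same answer for both.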
Since this is a change to LangRef, please post the proposal to llvm-dev.
Aug 4 2021
There is a target-specific hook to emit code for regular (non-atomic) memcpys: EmitTargetCodeForMemcpy. Maybe we should just implement a similar hook for element-atomic copy?
I think the name of the pass is misleading. What you are doing here is providing an inlined lowering for a memory builtin. Your current implementation has a limitation: it only inlines gc-leaf element atomic memcpys. I don't think the fact that it is gc-leaf is critical here. This transform can be extended to handle non-gc-leaf operations. It can also be extended to handle non-atomic operations.
Jul 1 2021
Looks good.
Jun 30 2021
Jun 18 2021
FYI, I tried to take this patch for performance verification, but it crashed in smoke testing. I haven't yet looked into why.
Jun 8 2021
Jun 3 2021
Mar 24 2021
I looked at the GC part and it looks good to me.
Nov 19 2020
Nov 17 2020
Nov 11 2020
Nov 10 2020
Nov 9 2020
Can we split restructuring and generalization (possibly adding a test for the generalization part)? It's hard to assess correctness when you combine the two in one change.
Oct 26 2020
Accepting this change to unblock the progress.
Oct 25 2020
Oct 23 2020
Oct 22 2020
Oct 21 2020
Address review comments.
Oct 20 2020
Still digging through the main logic. In the meantime see some minor comments inline.
Oct 19 2020
Oct 16 2020
Oct 12 2020
Oct 7 2020
Added a note to the doc that a GC parseable copy operation is not required to take a safepoint.
Oct 6 2020
Currently, if we have a loop with a safepoint poll, it is not converted into a memcpy/memmove. This is because the safepoint has read semantics and prevents LoopIdiomRecognize from performing the transform. In theory, we can have a transform which recognizes loops with safepoints and converts them to non-leaf memcpy/memmove. It will be up to this new transform to figure out the legality and interactions with the runtime requirements.
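As a rough source-level picture of the two shapes involved, consider the C++ sketch below. All names are hypothetical, and the poll is modeled as a relaxed load of a global flag, which is only an assumption about what a safepoint poll looks like to the optimizer: an extra read inside the loop body.

```cpp
#include <atomic>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for runtime state; not part of any real runtime.
static std::atomic<bool> pollRequested{false};
static std::size_t pollsTaken = 0;

// Copy loop with a poll on every iteration. The extra read in the body is
// the kind of operation that keeps a loop from matching the plain
// "store of a load" copy idiom.
static void copyLoopWithPoll(int *dst, const int *src, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    dst[i] = src[i];
    if (pollRequested.load(std::memory_order_relaxed))
      ++pollsTaken; // a real runtime would enter the safepoint here
  }
}

// What the loop computes when no poll is requested: a single bulk copy.
// Assumes the source and destination ranges do not overlap.
static void copyWithMemcpy(int *dst, const int *src, std::size_t count) {
  std::memcpy(dst, src, count * sizeof(int));
}
```

Both functions leave the destination with the same contents; the difference is only whether the copy can be interrupted at a poll, which is what a non-leaf memcpy/memmove would have to account for.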
Oct 5 2020
This was a draft. Abandoned in favor of D88861.
Sep 18 2020
Jun 19 2020
@mtrofin, I think it makes sense to have these tests available in all configurations. We have a bunch of printer passes which are not conditioned on NDEBUG.
Jun 17 2020
Looks good.
Jun 3 2020
Apr 16 2020
Apr 9 2020
Apr 7 2020
Can you please elaborate on what issue you are solving? Maybe give an example?
Feb 27 2020
Feb 26 2020
Demonstrating an accuracy improvement due to PHI translation turned out to be a bit tricky.
Feb 21 2020
I reworked the patch using PHITransAddr. Now, instead of tracking the backedge flag, I keep track of the address to check and translate the address when needed.
A couple of nit comments inline; LGTM once addressed.
Feb 20 2020
Feb 14 2020
Feb 12 2020
LGTM with the renaming addressed.
Ping?
Jan 31 2020
- Comment cleanup landed separately
- Fixed the issue mentioned in https://reviews.llvm.org/D68006#inline-616849
- Added more tests
Dec 16 2019
Nov 21 2019
Nov 11 2019
Nov 8 2019
Nov 7 2019
Yeah, this limitation seems arbitrary and overly conservative.
Can we handle the case with inbounds offsets only first and extend it with non-inbounds support in a follow-up change? This way, in the first change you won't need to change IR interfaces at all.
Nov 6 2019
LGTM. Maybe add a test where the WC has one use but the branch condition is used in multiple branches.