
Yesterday

alex-t added a comment to D63489: [InstSimplify] LCSSA PHIs should not be simplified away.

This causes generation of incorrect code in AMDGPU backend.

This sounds like some other check is missing elsewhere?
What happens if you feed it such IR, written by hand as if this transform had already run?
("that will result in broken asm/crashes" is hopefully not the answer)

That being said, why is LCSSAPass not sufficient?
It's already supposed to undo transforms like this.

It will result in syntactically correct asm and no crashes. At runtime we'll get an incorrect result, though :)
Adding the LCSSA pass again later on is difficult because of the pass dependencies.
So it's better to fix the explicit bug in SimplifyPHI....

Aha, so it's not -instsimplify pass itself, but how it's used during transition into backend.

  1. You certainly don't want to make this blacklist unconditional, it should still run when the -instsimplify pass itself is run. (+instsimplify test)
  2. How does this affect other targets (backends)? Does this need some TLI hook?

In fact, I still insist that this is a bug in the -instsimplify pass.
The algorithm is written in such a way that it always expects more than one input in a PHI node.
That means that whoever wrote it had no intention of actually removing the LCSSA PHIs, but was simply unaware of their existence!
It is not correct to remove LCSSA PHIs in the same manner as PHI nodes with equal inputs are removed.
So no other backend should suffer from this change. It is not about convenience for the AMDGPU backend but about correctness.
Could you provide an example where LCSSA PHIs are added and then intentionally removed?

Can you please specify why this transform is invalid from the LLVM IR point of view? https://godbolt.org/z/D8gKNc
In the endloop BB, which has a single predecessor BB (loop), the %counter.lcssa value can only be the %counter value.

Nothing is invalid from the IR point of view. This is all a kludge to get divergence information into SelectionDAG. There needs to be an IR instruction at the use point for the DAG to query the divergence analysis.
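
To make the disagreement above concrete, here is a toy model of the fold being discussed. It is not LLVM code: `Phi`, `simplify_phi` and the `keep_single_input` flag are hypothetical names invented for this sketch, and it ignores everything else the real pass considers. It only shows that a single-input (LCSSA-style) PHI is folded by the same "all incoming values are equal" rule as an ordinary merge PHI, and what an assumed guard in the spirit of D63489 would change.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Phi:
    """Toy PHI node: a result name plus (predecessor block, incoming value) pairs."""
    name: str
    incoming: tuple


def simplify_phi(phi: Phi, keep_single_input: bool = False) -> Optional[str]:
    """Return the value that can replace the PHI, or None to keep the PHI."""
    values = {value for _, value in phi.incoming}
    if len(values) != 1:
        return None  # incoming values differ, nothing to fold
    if keep_single_input and len(phi.incoming) == 1:
        return None  # assumed guard: leave LCSSA-style PHIs in place
    return next(iter(values))


# The shape from the godbolt example: one predecessor, one incoming value.
lcssa = Phi("%counter.lcssa", (("loop", "%counter"),))
# An ordinary merge PHI whose inputs happen to be equal.
merge = Phi("%x", (("then", "%a"), ("else", "%a")))

print(simplify_phi(lcssa))                          # folded to %counter
print(simplify_phi(lcssa, keep_single_input=True))  # kept (None)
print(simplify_phi(merge, keep_single_input=True))  # still folded to %a
```

Both positions in the thread are visible here: the fold is valid at the IR level because the PHI can only ever take one value, while the guard exists solely so that an instruction remains at the use point.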

Tue, Jun 18, 11:53 AM · Restricted Project
alex-t added a comment to D63489: [InstSimplify] LCSSA PHIs should not be simplified away.

This causes generation of incorrect code in AMDGPU backend.

This sounds like some other check is missing elsewhere?
What happens if you feed it such IR, written by hand as if this transform had already run?
("that will result in broken asm/crashes" is hopefully not the answer)

That being said, why is LCSSAPass not sufficient?
It's already supposed to undo transforms like this.

It will result in syntactically correct asm and no crashes. At runtime we'll get an incorrect result, though :)
Adding the LCSSA pass again later on is difficult because of the pass dependencies.
So it's better to fix the explicit bug in SimplifyPHI....

Aha, so it's not -instsimplify pass itself, but how it's used during transition into backend.

  1. You certainly don't want to make this blacklist unconditional, it should still run when the -instsimplify pass itself is run. (+instsimplify test)
  2. How does this affect other targets (backends)? Does this need some TLI hook?
Tue, Jun 18, 10:58 AM · Restricted Project
alex-t added a comment to D63489: [InstSimplify] LCSSA PHIs should not be simplified away.

This causes generation of incorrect code in AMDGPU backend.

This sounds like some other check is missing elsewhere?
What happens if you feed it such IR, written by hand as if this transform had already run?
("that will result in broken asm/crashes" is hopefully not the answer)

That being said, why is LCSSAPass not sufficient?
It's already supposed to undo transforms like this.

It will result in syntactically correct asm and no crashes. At runtime we'll get an incorrect result, though :)
Adding the LCSSA pass again later on is difficult because of the pass dependencies.
So it's better to fix the explicit bug in SimplifyPHI....

Aha, so it's not -instsimplify pass itself, but how it's used during transition into backend.

  1. You certainly don't want to make this blacklist unconditional, it should still run when the -instsimplify pass itself is run. (+instsimplify test)
  2. How does this affect other targets (backends)? Does this need some TLI hook?

The other condition should probably be TTI.hasBranchDivergence().

Tue, Jun 18, 10:20 AM · Restricted Project
alex-t added a comment to D63489: [InstSimplify] LCSSA PHIs should not be simplified away.

This causes generation of incorrect code in AMDGPU backend.

This sounds like some other check is missing elsewhere?
What happens if you feed it such IR, written by hand as if this transform had already run?
("that will result in broken asm/crashes" is hopefully not the answer)

That being said, why is LCSSAPass not sufficient?
It's already supposed to undo transforms like this.

Tue, Jun 18, 7:55 AM · Restricted Project
alex-t updated the diff for D63489: [InstSimplify] LCSSA PHIs should not be simplified away.

MIR test added

Tue, Jun 18, 7:42 AM · Restricted Project
alex-t added a comment to D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..

I have updated the change https://reviews.llvm.org/D62614 this Sunday.
The new one takes a completely different approach. I'd very much appreciate it if you could try it.

D62614 doesn't fix the issue.

Okay. I'm about to start a partial revert of the change.
Could you please provide me with test cases so that I can check whether my further fixes help?

See below the good and bad outputs for one CTS failure:

GOOD: https://hastebin.com/muwuwivofu
BAD: https://hastebin.com/gofawejoku

Thanks again for looking into this.

Note that this change also breaks https://bugs.freedesktop.org/show_bug.cgi?id=110811

Tue, Jun 18, 6:17 AM · Restricted Project
alex-t created D63489: [InstSimplify] LCSSA PHIs should not be simplified away.
Tue, Jun 18, 6:13 AM · Restricted Project

Thu, Jun 6

alex-t committed rL362749: [AMDGPU] Partial revert for the ba447bae7448435c9986eece0811da1423972fdd.
[AMDGPU] Partial revert for the ba447bae7448435c9986eece0811da1423972fdd
Thu, Jun 6, 2:10 PM

Wed, Jun 5

alex-t added a comment to D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..

I have updated the change https://reviews.llvm.org/D62614 this Sunday.
The new one takes a completely different approach. I'd very much appreciate it if you could try it.

D62614 doesn't fix the issue.

Wed, Jun 5, 2:07 AM · Restricted Project

Tue, Jun 4

alex-t created D62869: [AMDGPU] Partial revert for the "Assign register class for cross block values according to the divergence.".
Tue, Jun 4, 11:26 AM · Restricted Project
alex-t added a comment to D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..

A number of Mesa piglit tests are also affected at least on Bonaire (but it seems to not be GPU-specific, I haven't had a chance to look at it further).

- bin/ext_transform_feedback-order elements triangles
- bin/ext_transform_feedback-order elements points
- bin/ext_transform_feedback-order elements lines
- bin/ext_transform_feedback-order arrays triangles
- bin/ext_transform_feedback-order arrays points
- bin/ext_transform_feedback-order arrays lines
- arb_clear_buffer_object-formats (96-bit clears)

It's unclear whether the regression is caused by this particular commit or by the subsequent ASAN fix.

Tue, Jun 4, 5:13 AM · Restricted Project
alex-t added a comment to D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..

Hi there,

This change introduces a regression with RADV, all dEQP-VK.subgroups.arithmetic.framebuffer.* are failing now.
Can someone look into this?
Thanks!

I have a patch that fixes the issue in another test suite. Could you please suggest how to check if it also fixes RADV?
https://reviews.llvm.org/D62614

This patch fixes the CTS failures on my side. I have just tried the latest version.

Err, only a subset is fixed actually.

Tue, Jun 4, 5:07 AM · Restricted Project

Mon, Jun 3

alex-t added inline comments to D62614: Fix for the OCL/LC to failure on some OCLPerf tests.
Mon, Jun 3, 8:01 AM · Restricted Project

Sun, Jun 2

alex-t updated the diff for D62614: Fix for the OCL/LC to failure on some OCLPerf tests.

Alternative fix.

Sun, Jun 2, 9:27 AM · Restricted Project

Fri, May 31

alex-t added a comment to D62614: Fix for the OCL/LC to failure on some OCLPerf tests.

See D60834 for what I think is the right direction to fix this.

Once again:

LCSSA is not sufficient for the following reason:
Since the definition is considered uniform, it is selected to SALU and produces an SGPR.
The value in this SGPR is correct for all lanes live in the loop body.
The LCSSA PHI node that is inserted in the loop exit block will turn into a copy from SGPR to VGPR, which is incorrect.

Why isn't the SGPR to VGPR copy correct? If it is done while still inside the loop, it is suboptimal but correct. When done outside of the loop it is incorrect, that's true.
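
The difference between copying inside and outside the loop can be shown with a small simulation. This is only a sketch under stated assumptions: `run_wave` is a hypothetical helper, the loop is a do-while in which each lane leaves after its own positive trip count, and the "SGPR" is a single Python integer shared by all lanes. It does not model real AMDGPU instructions.

```python
def run_wave(trip_counts):
    """Simulate one wave running `do { counter += 1 } while (lane keeps looping)`.

    Returns (copy_outside, copy_inside): the per-lane VGPR contents when the
    SGPR-to-VGPR copy is placed in the exit block or inside the loop body.
    """
    lanes = len(trip_counts)
    exec_mask = [True] * lanes   # lanes still executing the loop
    sgpr_counter = 0             # one scalar value shared by the whole wave
    vgpr_inside = [0] * lanes    # per-lane copy made inside the loop

    while any(exec_mask):
        sgpr_counter += 1        # uniform among the lanes still in the loop
        for lane in range(lanes):
            if exec_mask[lane]:
                # Copy executed under the current exec mask.
                vgpr_inside[lane] = sgpr_counter
                if sgpr_counter >= trip_counts[lane]:
                    exec_mask[lane] = False  # this lane leaves the loop

    # Copy placed in the exit block: every lane reads the final SGPR value.
    vgpr_outside = [sgpr_counter] * lanes
    return vgpr_outside, vgpr_inside


outside, inside = run_wave([1, 2, 3, 4])
print(outside)  # every lane sees the value of the last lane to leave
print(inside)   # each lane keeps the value it had when it left
```

Inside the loop the counter really is uniform for the lanes that are still active, which is why selecting it to SALU looks legal. The exit-block copy is what loses the per-lane value.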

Fri, May 31, 9:32 AM · Restricted Project
alex-t updated the diff for D62614: Fix for the OCL/LC to failure on some OCLPerf tests.

added the Divergent Analysis test update that was missed

Fri, May 31, 8:45 AM · Restricted Project
alex-t added a comment to D62614: Fix for the OCL/LC to failure on some OCLPerf tests.

I don't agree at all that extending the definition of "divergence" to the scope is the correct way.
Literally, the value is uniform if all threads observe the same value. Nothing about the exec mask, lanes or the GPU :)
In our case, "all threads" means all threads executing the loop body. That's it.

Fri, May 31, 3:35 AM · Restricted Project
alex-t added a comment to D62614: Fix for the OCL/LC to failure on some OCLPerf tests.

See D60834 for what I think is the right direction to fix this.

Fri, May 31, 3:26 AM · Restricted Project

Thu, May 30

alex-t added a reviewer for D62614: Fix for the OCL/LC to failure on some OCLPerf tests: nhaehnle.
Thu, May 30, 7:02 AM · Restricted Project
alex-t added inline comments to D62614: Fix for the OCL/LC to failure on some OCLPerf tests.
Thu, May 30, 4:38 AM · Restricted Project
alex-t updated the diff for D62614: Fix for the OCL/LC to failure on some OCLPerf tests.

Added comments describing the reason for the change.

Thu, May 30, 4:38 AM · Restricted Project

Wed, May 29

alex-t added a reviewer for D62614: Fix for the OCL/LC to failure on some OCLPerf tests: rampitec.
Wed, May 29, 11:29 AM · Restricted Project
alex-t created D62614: Fix for the OCL/LC to failure on some OCLPerf tests.
Wed, May 29, 11:29 AM · Restricted Project
alex-t added a comment to D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..

Hi there,

This change introduces a regression with RADV, all dEQP-VK.subgroups.arithmetic.framebuffer.* are failing now.
Can someone look into this?
Thanks!

Wed, May 29, 9:34 AM · Restricted Project

Tue, May 28

alex-t added a comment to D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..

Hi there,

This change introduces a regression with RADV, all dEQP-VK.subgroups.arithmetic.framebuffer.* are failing now.
Can someone look into this?
Thanks!

Tue, May 28, 10:26 AM · Restricted Project

Mon, May 27

alex-t committed rL361776: [AMDGPU] Fix for the address sanitizer failure. Fixing typo.
[AMDGPU] Fix for the address sanitizer failure. Fixing typo
Mon, May 27, 11:15 AM
alex-t committed rL361770: [AMDGPU] Fix for the address sanitizer failure caused by the ifollowing….
[AMDGPU] Fix for the address sanitizer failure caused by the ifollowing…
Mon, May 27, 8:00 AM

Sun, May 26

alex-t committed rL361741: [AMDGPU] Divergence driven ISel. Assign register class for cross block….
[AMDGPU] Divergence driven ISel. Assign register class for cross block…
Sun, May 26, 1:31 PM

Fri, May 24

alex-t committed rL361644: [AMDGPU] Divergence driven ISel. Assign register class for cross block values….
[AMDGPU] Divergence driven ISel. Assign register class for cross block values…
Fri, May 24, 8:34 AM
alex-t closed D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..
Fri, May 24, 8:33 AM · Restricted Project

Thu, May 23

alex-t updated the diff for D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..

rebased

Thu, May 23, 9:15 AM · Restricted Project

May 20 2019

alex-t updated the diff for D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..

more formatting + new test updated

May 20 2019, 1:37 AM · Restricted Project

May 15 2019

alex-t updated the diff for D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..

formatting etc

May 15 2019, 5:13 AM · Restricted Project

May 14 2019

alex-t updated the diff for D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..

Added fixes after extended testing. Also GFX10 related update.

May 14 2019, 5:07 AM · Restricted Project

Apr 23 2019

alex-t accepted D60999: AMDGPU: Fix LCSSA phi lowering in SILowerI1Copies.

LGTM

Apr 23 2019, 4:53 AM · Restricted Project

Apr 8 2019

alex-t added inline comments to D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..
Apr 8 2019, 8:52 AM · Restricted Project
alex-t added a comment to D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..
Apr 8 2019, 8:10 AM · Restricted Project

Apr 5 2019

alex-t added a comment to D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..

Adding llvm-commits to the CC. Please be more careful about that in the future... see http://llvm.org/docs/Phabricator.html

Apr 5 2019, 1:42 AM · Restricted Project

Apr 4 2019

alex-t added inline comments to D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..
Apr 4 2019, 7:50 AM · Restricted Project

Apr 3 2019

alex-t added inline comments to D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..
Apr 3 2019, 4:37 AM · Restricted Project

Apr 2 2019

alex-t added a reviewer for D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence.: efriedma.
Apr 2 2019, 6:00 AM · Restricted Project
alex-t updated the diff for D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..

Changed according to the reviewer's request.

Apr 2 2019, 5:58 AM · Restricted Project

Mar 29 2019

alex-t updated the diff for D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..
Mar 29 2019, 8:15 AM · Restricted Project
alex-t created D59990: AMDGPU. Divergence driven ISel. Assign register class for cross block values according to the divergence..
Mar 29 2019, 6:41 AM · Restricted Project

Jan 3 2019

alex-t committed rL350350: [AMDGPU] Fix scalar operand folding bug that causes SHOC performance regression..
[AMDGPU] Fix scalar operand folding bug that causes SHOC performance regression.
Jan 3 2019, 11:59 AM
alex-t closed D56161: [AMDGPU] Fix scalar operand folding bug that causes SHOC performance regression.
Jan 3 2019, 11:59 AM
alex-t updated the summary of D56161: [AMDGPU] Fix scalar operand folding bug that causes SHOC performance regression.
Jan 3 2019, 10:50 AM
alex-t updated the summary of D56161: [AMDGPU] Fix scalar operand folding bug that causes SHOC performance regression.
Jan 3 2019, 10:08 AM
alex-t retitled D56161: [AMDGPU] Fix scalar operand folding bug that causes SHOC performance regression from [AMDGPU] Fix scalar operand folding. to [AMDGPU] Fix scalar operand folding bug that causes SHOC performance regression.
Jan 3 2019, 10:06 AM

Dec 30 2018

alex-t created D56161: [AMDGPU] Fix scalar operand folding bug that causes SHOC performance regression.
Dec 30 2018, 11:18 AM

Nov 14 2018

alex-t added inline comments to D54340: AMDGPU: Fix various issues around the VirtReg2Value mapping.
Nov 14 2018, 3:44 AM

Oct 26 2018

alex-t added a comment to D53496: AMDGPU: Rewrite SILowerI1Copies to always stay on SALU.

It seems like we have to further develop this approach to deal with the scalar comparison instructions.
For instance, S_CMP_* does not produce any result but implicitly defines SCC.
Thus, InstrEmitter will insert the copies all the time.
Since the DAG operator SETCC produces an i1 value, there will be SCC to VReg_1 copies.
I'm now trying to invent a method to lower those copies.
First issue: if all the uses are non-divergent, I don't need the V_CND_MASK -1,0 -> V_CMP_NE 0 pair.
I need S_CSELECT -1, 0 immediately after the definition (to save SCC) and S_CMP_NE 0 just before the use to rematerialize SCC.
Second issue: I only need to save/restore if there are SCC defs in between.
So we need to take non-divergent flow into account as well.
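
The save/restore sequence described above can be sketched as a toy model. The Python functions below are stand-ins named after the instructions in the comment, not real encodings, and `use_compare_later` and its `clobber` callback are hypothetical helpers that exist only for this illustration. SCC is modelled as one bit in a dictionary.

```python
def s_cmp_eq(state, a, b):
    state["scc"] = int(a == b)  # scalar compare: the only result is the SCC bit


def s_cselect(state, if_set, if_clear):
    # Reads SCC and returns one of two values; SCC itself is left unchanged.
    return if_set if state["scc"] else if_clear


def s_cmp_ne(state, a, b):
    state["scc"] = int(a != b)


def use_compare_later(x, y, clobber):
    """Define SCC, survive an intervening SCC def, and return SCC at the use."""
    state = {"scc": 0}
    s_cmp_eq(state, x, y)             # definition of the i1 value in SCC
    saved = s_cselect(state, -1, 0)   # save immediately after the definition
    clobber(state)                    # some other SCC def in between
    s_cmp_ne(state, saved, 0)         # rematerialize SCC just before the use
    return state["scc"]


def overwrite_with(bit):
    """Build a clobber that forces SCC to `bit`."""
    def clobber(state):
        state["scc"] = bit
    return clobber


print(use_compare_later(3, 3, overwrite_with(0)))  # original compare was true
print(use_compare_later(3, 4, overwrite_with(1)))  # original compare was false
```

If nothing redefines SCC between the definition and the use, the save and the rematerializing compare are both redundant, which is the second issue raised in the comment.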

Oct 26 2018, 6:53 AM

Oct 25 2018

alex-t added inline comments to D53496: AMDGPU: Rewrite SILowerI1Copies to always stay on SALU.
Oct 25 2018, 6:27 AM

Oct 16 2018

alex-t accepted D53283: AMDGPU: Divergence-driven selection of scalar buffer load intrinsics.

LGTM

Oct 16 2018, 12:26 AM · Restricted Project

Oct 1 2018

alex-t committed rL343455: [AMDGPU] Divergence driven instruction selection. Shift operations..
[AMDGPU] Divergence driven instruction selection. Shift operations.
Oct 1 2018, 4:08 AM
alex-t closed D52559: [AMDGPU] Divergence driven instruction selection. Shift operations..
Oct 1 2018, 4:08 AM

Sep 28 2018

alex-t added inline comments to D52559: [AMDGPU] Divergence driven instruction selection. Shift operations..
Sep 28 2018, 6:25 AM
alex-t updated the diff for D52559: [AMDGPU] Divergence driven instruction selection. Shift operations..

Fixes according to the discussion results.

Sep 28 2018, 6:18 AM

Sep 27 2018

alex-t added inline comments to D52559: [AMDGPU] Divergence driven instruction selection. Shift operations..
Sep 27 2018, 12:32 PM
alex-t added inline comments to D52559: [AMDGPU] Divergence driven instruction selection. Shift operations..
Sep 27 2018, 12:25 PM
alex-t added inline comments to D52559: [AMDGPU] Divergence driven instruction selection. Shift operations..
Sep 27 2018, 9:50 AM
alex-t added inline comments to D52559: [AMDGPU] Divergence driven instruction selection. Shift operations..
Sep 27 2018, 9:46 AM
alex-t added inline comments to D52559: [AMDGPU] Divergence driven instruction selection. Shift operations..
Sep 27 2018, 5:43 AM
alex-t updated the diff for D52559: [AMDGPU] Divergence driven instruction selection. Shift operations..

Pattern changed to GCNPat, divergence check added.

Sep 27 2018, 5:43 AM

Sep 26 2018

alex-t created D52559: [AMDGPU] Divergence driven instruction selection. Shift operations..
Sep 26 2018, 9:31 AM
alex-t closed D52019: [AMDGPU] Divergence driven instruction selection. Part 1..
Sep 26 2018, 9:28 AM

Sep 25 2018

alex-t accepted D52454: Run VerifyDAGDiverence in debug only.
Sep 25 2018, 12:41 PM
alex-t added a comment to D52454: Run VerifyDAGDiverence in debug only.

TargetTransformInfo::hasBranchDivergence() returns true only if divergence makes sense for the given target.
So compile time should only be affected for such targets: AMDGPU, NVPTX, etc.

Sep 25 2018, 3:24 AM

Sep 21 2018

alex-t added a comment to D52019: [AMDGPU] Divergence driven instruction selection. Part 1..

Committed r342719.

Sep 21 2018, 3:37 AM
alex-t committed rL342719: [AMDGPU] Divergence driven instruction selection. Part 1..
[AMDGPU] Divergence driven instruction selection. Part 1.
Sep 21 2018, 3:34 AM

Sep 20 2018

alex-t updated the diff for D52019: [AMDGPU] Divergence driven instruction selection. Part 1..

MC/Disassembler/AMDGPU passed
Tests fixed

Sep 20 2018, 4:26 AM

Sep 19 2018

alex-t added inline comments to D52019: [AMDGPU] Divergence driven instruction selection. Part 1..
Sep 19 2018, 7:55 AM
alex-t updated the diff for D52019: [AMDGPU] Divergence driven instruction selection. Part 1..

Source cleanup.

Sep 19 2018, 7:55 AM

Sep 13 2018

alex-t set the repository for D52019: [AMDGPU] Divergence driven instruction selection. Part 1. to rL LLVM.
Sep 13 2018, 5:54 AM
alex-t closed D43334: AMDGPU: fix for SIRegisterInfo::isVGPR() crash.
Sep 13 2018, 5:53 AM
alex-t closed D51316: [AMDGPU] Preliminary patch for divergence driven instruction selection. Operands Folding 1..
Sep 13 2018, 5:53 AM
alex-t closed D51586: [AMDGPU] Preliminary patch for divergence driven instruction selection. Inline immediate move to V_MADAK_F32..
Sep 13 2018, 5:53 AM
alex-t closed D51975: [AMDGPU] Preliminary patch for divergence driven instruction selection. Load offset inlining pattern changed..
Sep 13 2018, 5:53 AM
alex-t closed D51931: [AMDGPU] Load divergence predicate refactoring.
Sep 13 2018, 5:49 AM
alex-t set the repository for D51975: [AMDGPU] Preliminary patch for divergence driven instruction selection. Load offset inlining pattern changed. to rL LLVM.
Sep 13 2018, 5:09 AM
alex-t committed rL342120: [AMDGPU] Load divergence predicate refactoring.
[AMDGPU] Load divergence predicate refactoring
Sep 13 2018, 2:08 AM
alex-t created D52019: [AMDGPU] Divergence driven instruction selection. Part 1..
Sep 13 2018, 1:54 AM

Sep 12 2018

alex-t added inline comments to D51931: [AMDGPU] Load divergence predicate refactoring.
Sep 12 2018, 11:43 PM
alex-t committed rL342115: [AMDGPU] Preliminary patch for divergence driven instruction selection..
[AMDGPU] Preliminary patch for divergence driven instruction selection.
Sep 12 2018, 11:37 PM
alex-t created D51975: [AMDGPU] Preliminary patch for divergence driven instruction selection. Load offset inlining pattern changed..
Sep 12 2018, 4:19 AM
alex-t updated the diff for D51931: [AMDGPU] Load divergence predicate refactoring.

Formatting fixed, function renamed.

Sep 12 2018, 3:21 AM

Sep 11 2018

alex-t created D51931: [AMDGPU] Load divergence predicate refactoring.
Sep 11 2018, 7:21 AM
alex-t committed rL341928: [AMDGPU] Preliminary patch for divergence driven instruction selection..
[AMDGPU] Preliminary patch for divergence driven instruction selection.
Sep 11 2018, 4:58 AM
alex-t closed D51734: [AMDGPU] Preliminary patch for divergence driven instruction selection. Immediate selection predicate changed.
Sep 11 2018, 4:58 AM

Sep 10 2018

alex-t committed rL341843: [AMDGPU] Preliminary patch for divergence driven instruction selection..
[AMDGPU] Preliminary patch for divergence driven instruction selection.
Sep 10 2018, 9:44 AM

Sep 7 2018

alex-t added a comment to D51316: [AMDGPU] Preliminary patch for divergence driven instruction selection. Operands Folding 1..

committed: r341068
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341068 91177308-0d34-0410-b5e6-96231b3b80d8

Sep 7 2018, 2:22 AM
alex-t committed rL341636: [AMDGPU] Preliminary patch for divergence driven instruction selection. Fold….
[AMDGPU] Preliminary patch for divergence driven instruction selection. Fold…
Sep 7 2018, 2:10 AM
alex-t closed D51610: [AMDGPU] Preliminary patch for divergence driven instruction selection. Fold immediate SMRD offset..
Sep 7 2018, 2:10 AM

Sep 6 2018

alex-t created D51734: [AMDGPU] Preliminary patch for divergence driven instruction selection. Immediate selection predicate changed.
Sep 6 2018, 8:07 AM
alex-t updated the diff for D51610: [AMDGPU] Preliminary patch for divergence driven instruction selection. Fold immediate SMRD offset..

Unnecessary "isReg()" check removed.
Full context unified diff.

Sep 6 2018, 4:15 AM
alex-t updated the diff for D51586: [AMDGPU] Preliminary patch for divergence driven instruction selection. Inline immediate move to V_MADAK_F32..

Physical registers handling added. Test cases for physical registers added.

Sep 6 2018, 1:01 AM
alex-t retitled D51586: [AMDGPU] Preliminary patch for divergence driven instruction selection. Inline immediate move to V_MADAK_F32. from [AMDGPU] Preliminary patch for divergence driven instruction selection. Inline move immediate. to [AMDGPU] Preliminary patch for divergence driven instruction selection. Inline immediate move to V_MADAK_F32..
Sep 6 2018, 12:58 AM

Sep 4 2018

alex-t updated the diff for D51586: [AMDGPU] Preliminary patch for divergence driven instruction selection. Inline immediate move to V_MADAK_F32..

Formatting corrected.

Sep 4 2018, 4:36 AM
alex-t created D51610: [AMDGPU] Preliminary patch for divergence driven instruction selection. Fold immediate SMRD offset..
Sep 4 2018, 1:21 AM