- User Since
- Jul 26 2016, 7:17 AM (164 w, 5 d)
Fri, Sep 20
LGTM, but make sure it passes the recently updated lit tests.
Thu, Sep 19
Wed, Sep 18
I also have a question: why do you need to remove those predicates at all? Does it mean that to enable GlobalISel you'd need to remove all the divergent predicates stuff?
Tue, Sep 17
Mon, Sep 16
Changed according to the reviewer's request, plus some refactoring.
Fri, Sep 13
Empty block handling added
Commit was reverted due to the EXPENSIVE_CHECKS failure.
Tue, Sep 10
Thu, Sep 5
Changed according to the reviewers' requests. Tests added.
Tue, Sep 3
Aug 21 2019
Aug 15 2019
Suggested nits added.
Aug 8 2019
At the reviewer's request, a check for implicit defs has been added.
New helper functions in MachineInstr.h are necessary because the current interface, such as MachineInstr::getNumExplicitDefs, returns just MCInstDesc::NumDefs for non-variadic opcodes.
Jul 30 2019
Just a ping! Does anybody have any objections?
Jul 2 2019
Jul 1 2019
isLCSSAForm check is under EXPENSIVE_CHECKS
Jun 28 2019
Jun 25 2019
Jun 24 2019
Jun 20 2019
Jun 19 2019
Jun 18 2019
MIR test added
Jun 6 2019
Jun 5 2019
Jun 4 2019
Jun 3 2019
Jun 2 2019
May 31 2019
added the Divergent Analysis test update that was missed
I don't agree at all that extending the definition of "divergence" to include the scope is the correct way.
Literally, a value is uniform if all threads observe the same value. Nothing about exec mask, lanes or GPU :)
"All threads" in our case means all the threads executing the loop body. That's it.
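As a rough illustration of that definition, here is a minimal sketch in plain C++. It is a toy model, not LLVM code: `isUniform`, the per-thread `Observed` values and the `Executing` flags are all hypothetical names introduced only for this example. The value counts as uniform when every thread that is actually executing the code (the loop body, in this discussion) observes the same value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model: Observed[i] is the value thread i sees, and
// Executing[i] says whether thread i is among the threads running the code
// in question (for example, the loop body).
bool isUniform(const std::vector<int> &Observed,
               const std::vector<bool> &Executing) {
  assert(Observed.size() == Executing.size());
  bool SeenAny = false;
  int First = 0;
  for (std::size_t I = 0; I < Observed.size(); ++I) {
    if (!Executing[I])
      continue; // Threads outside the region do not take part in the check.
    if (!SeenAny) {
      First = Observed[I];
      SeenAny = true;
    } else if (Observed[I] != First) {
      return false; // Two executing threads disagree: the value is divergent.
    }
  }
  return true; // All executing threads agree (vacuously true if there are none).
}
```

With three threads observing 7, 7 and 3 where only the first two execute the loop body, this sketch reports the value as uniform; nothing in it refers to an exec mask or to lanes.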
May 30 2019
Added comments describing the reason for the change.
May 29 2019
May 28 2019
May 27 2019
May 26 2019
May 24 2019
May 23 2019
May 20 2019
more formatting + new test updated
May 15 2019
May 14 2019
Added fixes after extended testing. Also GFX10 related update.
Apr 23 2019
Apr 8 2019
Apr 5 2019
Apr 4 2019
Apr 3 2019
Apr 2 2019
Changed according to the reviewer's request.
Mar 29 2019
Jan 3 2019
Dec 30 2018
Nov 14 2018
Oct 26 2018
It seems like we have to further develop this approach to deal with the scalar comparison instructions.
For instance, S_CMP_* does not produce any result but implicitly defines SCC.
Thus, InstrEmitter will insert the copies all the time.
Since the DAG operator SETCC produces an i1 value, there will be SCC to VReg_1 copies.
I am now trying to invent a method to lower those copies.
First issue: if none of the uses are divergent, I don't need the V_CND_MASK -1,0 -> V_CMP_NE 0 pair.
Instead, I need S_CSELECT -1, 0 immediately after the definition (to save SCC) and S_CMP_NE 0 just before the use to rematerialize SCC.
Second issue: I only need to save/restore if there are SCC defs in between.
So we need to take non-divergent control flow into account as well.
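The second issue can be sketched as a simple scan, shown here in plain C++ as a toy model rather than real MachineInstr code. `ToyInstr` and `sccClobberedBetween` are hypothetical names, and the sketch assumes a single straight-line block, so it deliberately ignores the control-flow case mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a machine instruction: all we track is whether
// it (implicitly) defines SCC.
struct ToyInstr {
  bool DefinesSCC;
};

// Returns true if some instruction strictly between the SCC definition at
// DefIdx and its use at UseIdx redefines SCC. Only in that case would the
// save (S_CSELECT) / restore (S_CMP_NE) pair be needed in this toy model.
bool sccClobberedBetween(const std::vector<ToyInstr> &Block,
                         std::size_t DefIdx, std::size_t UseIdx) {
  assert(DefIdx < UseIdx && UseIdx < Block.size());
  for (std::size_t I = DefIdx + 1; I < UseIdx; ++I)
    if (Block[I].DefinesSCC)
      return true;
  return false;
}
```

If the function returns false, the use can read SCC directly from the original comparison and no copy through a register is required in this simplified setting.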
Oct 25 2018
Oct 16 2018