
nhaehnle (Nicolai Hähnle)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 9 2015, 4:06 AM (179 w, 6 d)

Recent Activity

Fri, Mar 15

nhaehnle accepted D58957: [AMDGPU] Add an experimental buffer fat pointer address space..

LGTM, whether you do the NoAlias between constant 32-bit and buffer address space here or separately. (Making NoAlias between constant 32-bit and the other existing address spaces should definitely be a different patch.)

Fri, Mar 15, 1:32 PM · Restricted Project, Restricted Project
nhaehnle accepted D59312: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges..

LGTM

Fri, Mar 15, 1:28 PM · Restricted Project
nhaehnle added a comment to D59312: AMDGPU: Fix a SIAnnotateControlFlow issue when there are multiple backedges..

Hmm, this is fragile -- I think your reasoning about domination is mostly sound, except when there's uniform control flow inside the loop itself, in which case I'm not sure. That said, it seems to me that the domination check is conservative, and your change shouldn't break anything that wasn't broken before, and a proper fix is potentially much more difficult.

Fri, Mar 15, 1:28 PM · Restricted Project

Thu, Mar 7

nhaehnle added inline comments to D58957: [AMDGPU] Add an experimental buffer fat pointer address space..
Thu, Mar 7, 1:30 AM · Restricted Project, Restricted Project

Tue, Mar 5

nhaehnle accepted D58895: [TableGen] Allow lists to be concatenated through '#'.

Awesome :)
LGTM

Tue, Mar 5, 2:20 AM · Restricted Project

Feb 12 2019

Herald updated subscribers of D57511: [DebugInfo] Stop changing labels for register-described parameter DBG_VALUEs.
Feb 12 2019, 1:19 AM · Restricted Project, debug-info
nhaehnle accepted D57894: AMDGPU: Fix @llvm.amdgcn.wqm.vote implementation.

FWIW, the problem is a bit more involved than that. Consider

bool value = ...;
if (divergent condition) {
  use(value);
}

So the undefinedness of inactive bits in this case is not due to suboptimal lowering of NOT, but inherently due to the fact that we're in a different region of control flow.

Feb 12 2019, 1:19 AM · Restricted Project
Herald updated subscribers of D53765: [RFC prototype] Implementation of asm-goto support in LLVM.
Feb 12 2019, 1:14 AM · Restricted Project
nhaehnle added a comment to D58017: [DAG] Add SimplifyDemandedBits support for BSWAP/BITREVERSE.

Hmm, this is one of those cases where it'd be awesome to have a Godbolt for the tests.

Feb 12 2019, 1:13 AM · Restricted Project
Herald updated subscribers of D58026: LLD: Preserve ABI version during linking ELF.
Feb 12 2019, 1:10 AM · Restricted Project
nhaehnle accepted D58077: [tablegen] Add locations to many PrintFatalError() calls.

+1 for better error messages! LGTM.

Feb 12 2019, 1:09 AM · Restricted Project

Feb 8 2019

nhaehnle added a comment to D57825: IR: Add immarg attribute.

@nhaehnle can you look at the InstCombineSimplifyDemanded change? I was a bit confused by the assert you added

The background of this is that if TFE/LWE is enabled, the SimplifyDemanded logic won't work as-is; but it should also never be hit, because intrinsic calls with TFE/LWE should have a struct return type (e.g. {v4f32,i32}) and the SimplifyDemanded logic doesn't support looking through that. The assert double-checks that.

In hindsight, it would be possible for somebody to manually create malformed IR which calls image intrinsics with TFE/LWE enabled but with a vector return type. That would be an error, and one could argue that the code should produce an error instead of an assert. It would require a broken frontend or manually written IR, though.

I would expect TFE to be turned on by the usage of the struct type. We should probably add a custom verifier check for this

Feb 8 2019, 8:02 AM
nhaehnle added a comment to D57825: IR: Add immarg attribute.

@nhaehnle can you look at the InstCombineSimplifyDemanded change? I was a bit confused by the assert you added

Feb 8 2019, 7:46 AM
nhaehnle added a comment to D57825: IR: Add immarg attribute.

Looks reasonable to me. Maybe give a little more time for people to give feedback? Though this has already been on llvm-dev without opposition...

Feb 8 2019, 4:27 AM

Feb 7 2019

nhaehnle added a comment to D57748: AMDGPU: Add inverse ballot intrinsic.

Why can't we recognize this as a pattern? Basically, it's just (src & (1 << thread_idx)), and thread_idx can be matched as a sequence of mbcnt intrinsics.
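
A minimal sketch of the bit test described above, written as plain C++ rather than as AMDGPU IR. The function name inverseBallotLane and the explicit lane parameter are assumptions made for illustration; in a real shader the lane index would come from the mbcnt sequence mentioned in the comment. A 64-bit shift is used so the sketch covers a 64-wide wave.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): returns true if the bit belonging to
// `lane` is set in the wave-wide mask `src`. This models
// (src & (1 << thread_idx)) != 0 for a single lane.
bool inverseBallotLane(std::uint64_t src, unsigned lane) {
  assert(lane < 64 && "lane index out of range for a 64-wide wave");
  return (src & (std::uint64_t{1} << lane)) != 0;
}
```

Calling inverseBallotLane(0x5, 2) tests bit 2 of the mask 0b101; each lane evaluating this with its own index recovers the per-lane boolean from the mask.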

Feb 7 2019, 9:42 AM · Restricted Project
nhaehnle added a comment to D57737: [AMDGPU] Fix DPP sequence in atomic optimizer..

Did you actually test this? The shift-by-3 should be unnecessary.

Feb 7 2019, 3:54 AM · Restricted Project, Restricted Project
nhaehnle accepted D56496: [AMDGPU] Fix CS scratch setup on pre-GCN3 ASICs.

LGTM

Feb 7 2019, 3:32 AM · Restricted Project
nhaehnle added a comment to D55474: [AMDGPU] Extend constant folding for logical operations.

It looks like this was never committed. What's the next step here?

Feb 7 2019, 3:18 AM
nhaehnle abandoned D53161: Fix some cases where the index size was used instead of the pointer size.

I still find the code as-is a bit dubious, but we no longer need this change and the review process tends to be a bit of a pain, so I'm dropping this.

Feb 7 2019, 3:16 AM · Restricted Project
nhaehnle accepted D55444: AMDGPU: Fix DPP combiner.

LGTM

Feb 7 2019, 12:43 AM · Restricted Project, Restricted Project

Feb 4 2019

nhaehnle committed rGa69146e67eb7: [InstCombine] Cleanup the TFE/LWE check in AMDGPU SimplifyDemanded (authored by nhaehnle).
[InstCombine] Cleanup the TFE/LWE check in AMDGPU SimplifyDemanded
Feb 4 2019, 1:25 PM
nhaehnle added a comment to D57681: [InstCombine] Cleanup the TFE/LWE check in AMDGPU SimplifyDemanded.

Does it actually matter? I thought since this needs to be a constant, this just needs to not crash

Feb 4 2019, 1:12 PM · Restricted Project
nhaehnle created D57681: [InstCombine] Cleanup the TFE/LWE check in AMDGPU SimplifyDemanded.
Feb 4 2019, 5:03 AM · Restricted Project
nhaehnle added a reviewer for D57681: [InstCombine] Cleanup the TFE/LWE check in AMDGPU SimplifyDemanded: msearles.
Feb 4 2019, 5:03 AM · Restricted Project

Jan 15 2019

nhaehnle added a comment to D55444: AMDGPU: Fix DPP combiner.

Hi Valery, I really like the way the different cases are listed in the explanatory comment at the top of the file, and I believe those cases are correct. Would it be possible to restructure the code in a way that follows those cases? I think that would make it much easier to follow.

Jan 15 2019, 3:34 AM · Restricted Project, Restricted Project

Jan 2 2019

nhaehnle accepted D55179: AMDGPU: Remove v16i8 from register classes.

LGTM

Jan 2 2019, 1:04 AM

Dec 26 2018

nhaehnle added a comment to D56002: [AMDGPU] Fix a weird WWM intrinsic issue..

The only user of canReadVGPR is addUsersToMoveToVALUWorklist. Since the intended semantics of canReadVGPR aren't at all clear from the name, might I suggest folding it into its only user?

Dec 26 2018, 8:36 AM · Restricted Project

Dec 18 2018

nhaehnle accepted D55602: AMDGPU/InsertWaitcnts: Update VGPR/SGPR bounds when brackets are merged.

Thanks, LGTM

Dec 18 2018, 7:21 AM

Dec 12 2018

nhaehnle accepted D54042: [AMDGPU] Extend the SI Load/Store optimizer to combine more things..

LGTM

Dec 12 2018, 2:12 AM · Restricted Project

Dec 10 2018

nhaehnle accepted D55367: [AMDGPU] Change the l1 flush instruction for AMDPAL/MESA3D..

LGTM

Dec 10 2018, 7:44 AM · Restricted Project

Dec 7 2018

nhaehnle added inline comments to D55435: [AMDGPU] Fix discarded result of addAttribute.
Dec 7 2018, 8:11 AM
nhaehnle added inline comments to D55402: [AMDGPU] Simplify negated condition.
Dec 7 2018, 8:08 AM
nhaehnle added a comment to D55369: AMDGPU: Use an ABS32_LO relocation for SCRATCH_RSRC_DWORD1.

I thought mesa was moving to stop using the relocations at all for this?

Dec 7 2018, 7:56 AM
nhaehnle added a comment to D55367: [AMDGPU] Change the l1 flush instruction for AMDPAL/MESA3D..

Please make the change apply to Mesa3D as well.

Dec 7 2018, 7:54 AM · Restricted Project
nhaehnle added a comment to D55181: AMDGPU: Convert tests away from llvm.SI.load.const.

Note, we still have one use of this in Mesa.

Dec 7 2018, 7:51 AM
nhaehnle accepted D55180: AMDGPU: Allow f32 types for llvm.amdgcn.s.buffer.load.

LGTM

Dec 7 2018, 7:50 AM
nhaehnle added a comment to D55179: AMDGPU: Remove v16i8 from register classes.

Can we also remove the setTruncStoreAction mention of v16i8 from SIISelLowering?

Dec 7 2018, 7:49 AM
nhaehnle accepted D55176: AMDGPU: Remove llvm.SI.buffer.load.dword.

LGTM

Dec 7 2018, 7:48 AM
nhaehnle accepted D55177: AMDGPU: Remove llvm.SI.tbuffer.store.

LGTM

Dec 7 2018, 7:48 AM
nhaehnle accepted D55175: AMDGPU: Remove llvm.AMDGPU.kill.

LGTM

Dec 7 2018, 7:47 AM
nhaehnle accepted D50633: [AMDGPU] Add new Mode Register pass.

Oops, I thought I'd posted this earlier... LGTM.

Dec 7 2018, 7:44 AM

Dec 6 2018

nhaehnle created D55369: AMDGPU: Use an ABS32_LO relocation for SCRATCH_RSRC_DWORD1.
Dec 6 2018, 7:01 AM

Dec 1 2018

nhaehnle added a comment to D53493: [DA] GPUDivergenceAnalysis for unstructured GPU kernels.

LGTM. Do you need me to commit this?

Yes, I cannot commit myself.

What are the current future plans for this whole series? I think we should enable this by default, but it would be nice to get rid of the old code, which requires addressing the irreducible test case. I vaguely recall that you mentioned a way to address that, but I can't find that now.

I will open a new patch series to address the irreducibility issue. That series should also involve deprecating the LegacyDivergenceAnalysis for good.

Dec 1 2018, 3:06 AM

Nov 30 2018

nhaehnle accepted D55093: [AMDGPU] Disable SReg Global LD/ST, perf regression.

Sad, but LGTM

Nov 30 2018, 6:09 AM
nhaehnle added a comment to D50633: [AMDGPU] Add new Mode Register pass.

One last thing. If I'm right about this, it'd be good to reduce the check complexity and indentation levels. Then I'm happy :)

Nov 30 2018, 6:08 AM
nhaehnle updated the diff for D54340: AMDGPU: Fix various issues around the VirtReg2Value mapping.
  • prefer to use !TRI.isSGPRReg
Nov 30 2018, 5:57 AM

Nov 29 2018

nhaehnle accepted D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.

LGTM

Nov 29 2018, 4:04 AM
nhaehnle added a comment to D52416: Allow FP types for atomicrmw xchg.

Should there be a change to LangRef?

Nov 29 2018, 4:00 AM
nhaehnle added a comment to D52785: [PseudoSourceValue] New category to represent floating-point status.

FWIW, I also think it's important not to drop MachineMemOperands. As Hal points out, MMOs carry non-optional information such as volatile-ness. And the Size is non-optional information at least for the AMDGPU backend.

Nov 29 2018, 3:51 AM
nhaehnle added inline comments to D50633: [AMDGPU] Add new Mode Register pass.
Nov 29 2018, 3:34 AM
nhaehnle added a comment to D50633: [AMDGPU] Add new Mode Register pass.

Thank you. One question left though.

Nov 29 2018, 3:29 AM
nhaehnle added a comment to D54340: AMDGPU: Fix various issues around the VirtReg2Value mapping.

ping

Nov 29 2018, 3:04 AM
nhaehnle added a comment to D51994: TableGen/ISel: Allow PatFrag predicate code to access captured operands.

ping^4

Nov 29 2018, 2:59 AM

Nov 25 2018

nhaehnle added a comment to D54855: [AMDGPU] An exp must be branched over if exec=0.

This should already be fixed in trunk. Trunk has a function TII->hasUnwantedEffectsWhenEXECEmpty to cover this, and this patch simply shouldn't apply.

Nov 25 2018, 2:34 PM

Nov 22 2018

nhaehnle updated the diff for D54228: AMDGPU/InsertWaitcnts: Simplify pending events tracking.
  • add TODO comment about the opportunity of keeping per-event timelines
Nov 22 2018, 5:16 AM
nhaehnle updated the diff for D54226: AMDGPU/InsertWaitcnts: Untangle some semi-global state.
  • Early-out in generateWaitcntInstBefore when no wait is needed. This helps keep the nesting complexity in check for later changes.
Nov 22 2018, 5:16 AM

Nov 19 2018

nhaehnle added a comment to D51994: TableGen/ISel: Allow PatFrag predicate code to access captured operands.

ping^3

Nov 19 2018, 4:05 AM
nhaehnle added a comment to D54231: AMDGPU/InsertWaitcnts: Remove the dependence on MachineLoopInfo.

ping

Nov 19 2018, 4:03 AM
nhaehnle added a comment to D54228: AMDGPU/InsertWaitcnts: Simplify pending events tracking.

I think the remarks by @t-tye point to a potentially useful optimization, but that should not be part of this patch.

Nov 19 2018, 4:03 AM
nhaehnle added a comment to D54226: AMDGPU/InsertWaitcnts: Untangle some semi-global state.

ping

Nov 19 2018, 4:02 AM
nhaehnle added inline comments to D54649: [FPEnv] Rough out constrained FCmp intrinsics.
Nov 19 2018, 3:59 AM · Restricted Project
nhaehnle accepted D54606: [AMDGPU] Convert insert_vector_elt into set of selects.

Yeah, the separate DAG combine for scalar select w/ undef is a better solution. LGTM.

Nov 19 2018, 3:54 AM
nhaehnle added inline comments to D54649: [FPEnv] Rough out constrained FCmp intrinsics.
Nov 19 2018, 3:51 AM · Restricted Project

Nov 16 2018

nhaehnle added a comment to D54606: [AMDGPU] Convert insert_vector_elt into set of selects.

However, why does code with undef vectors look so bad? For example, in float4_inselt, the fact that the initial vector is undef should allow us to just store a splat of 1.0.

Yes, I noticed that too. That needs to be a separate optimization. As far as I understand "insert_vector_element undef, %var, %idx" should not even come to this point. It needs to be replaced by build_vector (n x %var) regardless of the thresholds and heuristics I am using, e.g. earlier (higher in the same function I think).

Nov 16 2018, 10:07 AM
nhaehnle added a comment to D54516: [AMDGPU] Do not mark llvm.amdgcn.set.inactive as IntrNoMem.
In D54516#1301072, @tpr wrote:

EarlyCSE does seem to common up in this situation. And, if I disable that, I get GVN commoning it up.

By "disable", do you mean modifying EarlyCSE to not touch convergent calls? What if you do that for GVN as well? GVN::ValueTable::lookupOrAddCall() seems to be the right place for that. Such spot fixes are a useful step in the right direction, to be preferred over repeating the asm hack. One already exists in GVNHoist. This may sound a bit whackamoley, but the "effort" that @nhaehnle was asking about is essentially a more organized way to audit all the places where convergent calls should be handled specially. That project has not gained sufficient motivation yet to commit to.

Nov 16 2018, 9:58 AM
nhaehnle added a comment to D54516: [AMDGPU] Do not mark llvm.amdgcn.set.inactive as IntrNoMem.

I believe the combination of Convergent + not Speculatable should mean that the compiler shouldn't hoist it to a non-control-equivalent block and shouldn't CSE it. In particular, IIRC it's not guaranteed that a readnone function always returns the same value when it's called with the same arguments, so it's not safe to CSE -- it just means that LLVM can move other things across it, since it doesn't modify *caller-visible* state. What pass is causing a problem? Maybe it's a bug in the pass?

readnone is literally readnone. Maybe you're mixing this up with inaccessiblememonly?

No, I meant readnone, since AFAIK that's what IntrNoMem here maps to. The bit about "caller-visible state" was taken directly from the langref entry for readnone. My point was that in something like:

WWM code
if (cc) {
  other stuff v1
} else {
  identical WWM code
  other stuff v2
}

it wouldn't be allowed for LLVM to rewrite the use of the second WWM to point to the first, even now, since the semantics of readnone aren't strong enough to guarantee that the two calls return the same thing, at least as far as I understand (I think someone else, maybe Tom, explained this to me over IRC a while ago). Your issue is different, and indeed removing IntrNoMem isn't going to help with that at all. I agree that we need a better solution for that. But for the current patch, I can't think of a situation where removing NoMem/readonly would disallow a transform that shouldn't be allowed.

Nov 16 2018, 6:25 AM
nhaehnle added a comment to D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.

Thank you, this looks much cleaner. I only have a small number of nitpicks left.

Nov 16 2018, 4:43 AM
nhaehnle added a comment to D54516: [AMDGPU] Do not mark llvm.amdgcn.set.inactive as IntrNoMem.

I believe the combination of Convergent + not Speculatable should mean that the compiler shouldn't hoist it to a non-control-equivalent block and shouldn't CSE it. In particular, IIRC it's not guaranteed that a readnone function always returns the same value when it's called with the same arguments, so it's not safe to CSE -- it just means that LLVM can move other things across it, since it doesn't modify *caller-visible* state. What pass is causing a problem? Maybe it's a bug in the pass?

Nov 16 2018, 4:23 AM
nhaehnle added a comment to D54516: [AMDGPU] Do not mark llvm.amdgcn.set.inactive as IntrNoMem.

I agree this needs a test case.

Nov 16 2018, 4:18 AM
nhaehnle added a comment to D54340: AMDGPU: Fix various issues around the VirtReg2Value mapping.

Is there code for the lazy map that should be cleaned up now?

Nov 16 2018, 4:13 AM
nhaehnle updated the diff for D54340: AMDGPU: Fix various issues around the VirtReg2Value mapping.
  • add LLVM_ATTRIBUTE_UNUSED to prevent warning in optimized builds
  • cleanup isSDNodeSourceOfDivergence a bit more
Nov 16 2018, 4:12 AM
nhaehnle added a comment to D54606: [AMDGPU] Convert insert_vector_elt into set of selects.

Mostly looks good to me.

Nov 16 2018, 4:10 AM

Nov 15 2018

nhaehnle accepted D53840: Preprocessing support in tablegen.

LGTM

Nov 15 2018, 12:35 PM
nhaehnle added a comment to D50633: [AMDGPU] Add new Mode Register pass.

The update seems to have messed up the indentation of comments in a few places.

Nov 15 2018, 10:54 AM

Nov 9 2018

nhaehnle created D54340: AMDGPU: Fix various issues around the VirtReg2Value mapping.
Nov 9 2018, 11:51 AM
nhaehnle added a comment to D54226: AMDGPU/InsertWaitcnts: Untangle some semi-global state.

Note, the new waitcnt-preexisting.mir test shows this change.

Nov 9 2018, 7:15 AM
nhaehnle updated the diff for D54226: AMDGPU/InsertWaitcnts: Untangle some semi-global state.

Turns out I was a bit too quick in my analysis of the second point.
I thought the overly conservative waitcnt was due to the control flow
in the shader I was looking at, but it was actually due to a pre-existing
waitcnt.

Nov 9 2018, 7:14 AM
nhaehnle added a comment to D54228: AMDGPU/InsertWaitcnts: Simplify pending events tracking.

This is sufficient, because whenever only one event of a count type is pending, its last time point is naturally the upper bound of all time points of this count type, and when multiple event types are pending, the count type has gone out of order and an s_waitcnt to 0 is required to clear any pending event type (and will then clear all pending event types for that count type).

Just wondered if we can do better than using 0. Could the lowest count be used instead, as this should be sufficient to ensure all out-of-order events in this have happened? I had discussed this with Bob at one time.

Nov 9 2018, 6:30 AM
nhaehnle accepted D53493: [DA] GPUDivergenceAnalysis for unstructured GPU kernels.
Nov 9 2018, 5:51 AM
nhaehnle added a comment to D53493: [DA] GPUDivergenceAnalysis for unstructured GPU kernels.

LGTM. Do you need me to commit this?

Nov 9 2018, 5:51 AM
nhaehnle accepted D54235: [AMDGPU] Always pass TRI into findRegister[Use/Def]OperandIdx.

LGTM

Nov 9 2018, 5:40 AM
nhaehnle added a comment to D54128: Fix MachineInstr::findRegisterUseOperandIdx subreg checks.

The code change looks fine to me, but it should be possible to clean up the test case a bit.

Nov 9 2018, 5:38 AM
nhaehnle accepted D54164: [AMDGPU] Optimize S_CBRANCH_VCC[N]Z -> S_CBRANCH_EXEC[N]Z.

Two stylistic nitpicks. LGTM with those addressed.

Nov 9 2018, 5:33 AM

Nov 8 2018

nhaehnle added inline comments to D54042: [AMDGPU] Extend the SI Load/Store optimizer to combine more things..
Nov 8 2018, 10:18 AM · Restricted Project
nhaehnle added a comment to D54042: [AMDGPU] Extend the SI Load/Store optimizer to combine more things..

The huge switch statements are a poster child for the generic SearchableTables, somewhat analogous to what already exists for MIMGInstructions. Sketching it out:

class LoadStoreBaseOpcode {
  LoadStoreBaseOpcode BaseOpcode = !cast<LoadStoreBaseOpcode>(NAME);
  bit Srsrc;
  bit Sbase;
  ...
}
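
For readers unfamiliar with the idea, the following hand-written C++ analogue shows roughly what a searchable table offers over a large switch: one sorted data table plus one generic lookup. Everything in it is hypothetical and for illustration only: the LoadStoreInfo struct, the numeric opcodes, and lookupLoadStoreInfo are made up, and the code that TableGen actually emits for SearchableTables looks different.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

// Hypothetical per-opcode properties, mirroring the Srsrc/Sbase bits above.
struct LoadStoreInfo {
  unsigned Opcode;
  bool Srsrc;
  bool Sbase;
};

// Made-up opcodes; the table must stay sorted by Opcode for the binary search.
static const LoadStoreInfo LoadStoreTable[] = {
    {10, true, false},
    {20, false, true},
    {30, true, false},
};

// Returns the entry for Opcode, or nullptr if the opcode is not in the table.
const LoadStoreInfo *lookupLoadStoreInfo(unsigned Opcode) {
  const LoadStoreInfo *Begin = std::begin(LoadStoreTable);
  const LoadStoreInfo *End = std::end(LoadStoreTable);
  // Debug-only check of the sortedness precondition.
  assert(std::is_sorted(Begin, End,
                        [](const LoadStoreInfo &A, const LoadStoreInfo &B) {
                          return A.Opcode < B.Opcode;
                        }));
  const LoadStoreInfo *It = std::lower_bound(
      Begin, End, Opcode, [](const LoadStoreInfo &Entry, unsigned Key) {
        return Entry.Opcode < Key;
      });
  return (It != End && It->Opcode == Opcode) ? It : nullptr;
}
```

With a table like this, a property query becomes lookupLoadStoreInfo(Opc)->Sbase after a null check, and adding an instruction means adding a row instead of touching every switch.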
Nov 8 2018, 9:12 AM · Restricted Project

Nov 7 2018

nhaehnle added a comment to D53283: AMDGPU: Divergence-driven selection of scalar buffer load intrinsics.

Hi Nicolai,

FYI, this introduced a regression with Mass Effect Andromeda with DXVK and RADV on Polaris10. See https://bugs.freedesktop.org/show_bug.cgi?id=108611

Nov 7 2018, 2:28 PM
nhaehnle added a parent revision for D54226: AMDGPU/InsertWaitcnts: Untangle some semi-global state: D54225: AMDGPU/InsertWaitcnts: Some more const-correctness.
Nov 7 2018, 2:19 PM
nhaehnle added a child revision for D54225: AMDGPU/InsertWaitcnts: Some more const-correctness: D54226: AMDGPU/InsertWaitcnts: Untangle some semi-global state.
Nov 7 2018, 2:19 PM
nhaehnle added a parent revision for D54227: AMDGPU/InsertWaitcnts: Use foreach loops for inst and wait event types: D54226: AMDGPU/InsertWaitcnts: Untangle some semi-global state.
Nov 7 2018, 2:19 PM
nhaehnle added a child revision for D54226: AMDGPU/InsertWaitcnts: Untangle some semi-global state: D54227: AMDGPU/InsertWaitcnts: Use foreach loops for inst and wait event types.
Nov 7 2018, 2:19 PM
nhaehnle added a parent revision for D54228: AMDGPU/InsertWaitcnts: Simplify pending events tracking: D54227: AMDGPU/InsertWaitcnts: Use foreach loops for inst and wait event types.
Nov 7 2018, 2:19 PM
nhaehnle added a child revision for D54227: AMDGPU/InsertWaitcnts: Use foreach loops for inst and wait event types: D54228: AMDGPU/InsertWaitcnts: Simplify pending events tracking.
Nov 7 2018, 2:19 PM
nhaehnle added a child revision for D54228: AMDGPU/InsertWaitcnts: Simplify pending events tracking: D54229: AMDGPU/InsertWaitcnt: Remove unused WaitAtBeginning.
Nov 7 2018, 2:19 PM
nhaehnle added a parent revision for D54229: AMDGPU/InsertWaitcnt: Remove unused WaitAtBeginning: D54228: AMDGPU/InsertWaitcnts: Simplify pending events tracking.
Nov 7 2018, 2:19 PM
nhaehnle added a parent revision for D54230: AMDGPU/InsertWaitcnt: Consistently use uint32_t for scores / time points: D54229: AMDGPU/InsertWaitcnt: Remove unused WaitAtBeginning.
Nov 7 2018, 2:17 PM
nhaehnle added a child revision for D54229: AMDGPU/InsertWaitcnt: Remove unused WaitAtBeginning: D54230: AMDGPU/InsertWaitcnt: Consistently use uint32_t for scores / time points.
Nov 7 2018, 2:17 PM
nhaehnle added a parent revision for D54231: AMDGPU/InsertWaitcnts: Remove the dependence on MachineLoopInfo: D54230: AMDGPU/InsertWaitcnt: Consistently use uint32_t for scores / time points.
Nov 7 2018, 2:17 PM
nhaehnle added a child revision for D54230: AMDGPU/InsertWaitcnt: Consistently use uint32_t for scores / time points: D54231: AMDGPU/InsertWaitcnts: Remove the dependence on MachineLoopInfo.
Nov 7 2018, 2:17 PM
nhaehnle created D54231: AMDGPU/InsertWaitcnts: Remove the dependence on MachineLoopInfo.
Nov 7 2018, 2:17 PM
nhaehnle created D54230: AMDGPU/InsertWaitcnt: Consistently use uint32_t for scores / time points.
Nov 7 2018, 2:16 PM