
nhaehnle (Nicolai Hähnle)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 9 2015, 4:06 AM (223 w, 1 d)

Recent Activity

Thu, Jan 9

nhaehnle accepted D71386: [AMDGPU] Remove unnecessary v_mov from a register to itself in WQM lowering..

LGTM, with one minor nitpick.

Thu, Jan 9, 6:46 AM · Restricted Project
nhaehnle accepted D72349: AMDGPU/GlobalISel: Fix import of zext of s16 op patterns.

LGTM

Thu, Jan 9, 6:46 AM · Restricted Project
nhaehnle accepted D72346: AMDGPU/GlobalISel: Add IMMPopCount xform.

LGTM

Thu, Jan 9, 6:37 AM · Restricted Project
nhaehnle accepted D72345: AMDGPU/GlobalISel: Add selectVOP3Mods_nnan.

LGTM

Thu, Jan 9, 6:37 AM · Restricted Project
nhaehnle accepted D72340: AMDGPU/GlobalISel: Add equiv xform for bitcast_fpimm_to_i32.

Test case? LGTM apart from that.

Thu, Jan 9, 6:37 AM · Restricted Project
nhaehnle accepted D72338: AMDGPU/GlobalISel: Fix add of neg inline constant pattern.

LGTM

Thu, Jan 9, 6:37 AM · Restricted Project
nhaehnle added a comment to D72332: AMDGPU/GlobalISel: Custom legalize v2s16 G_SHUFFLE_VECTOR.

This makes sense, but I do have one comment.

Thu, Jan 9, 6:27 AM · Restricted Project
nhaehnle added a comment to D72325: [AMDGPU] Fix cluster size threshold calculation.

Don't we *want* clusters that large, and even larger?

Thu, Jan 9, 6:27 AM · Restricted Project

Dec 15 2019

nhaehnle added a comment to D71358: AMDGPU: Remove denormal subtarget features.

Once the form of the attributes is finalized, frontends can start emitting the new form and we just won’t look at it until this patch

Dec 15 2019, 1:40 PM · Restricted Project
nhaehnle added a comment to D71474: [TableGen] Introduce an if/then/else statement..

I agree with Hal that if/else should introduce a scope. Otherwise, you get into consistency problems. Consider:

multiclass Foo<...> {
   if ... then {
      defvar bar = ...;
   }
   defvar baz = bar;
}

Without introducing a scope, the above is presumably valid but in a rather surprising way. And a similar example with an else that also defines bar wouldn't be valid due to the redefinition.
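
The two outcomes described above can be sketched with a toy symbol table. This is only an illustrative C++ model, not TableGen's parser: `ScopeStack`, `Outcome`, and `parseIfElse` are hypothetical names, and the model assumes that names from both branches are registered while parsing.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Toy model of parse-time name scoping; NOT TableGen's actual parser.
class ScopeStack {
public:
  ScopeStack() { pushScope(); } // outermost scope (the multiclass body)

  void pushScope() { Scopes.emplace_back(); }

  void popScope() {
    assert(Scopes.size() > 1 && "cannot pop the outermost scope");
    Scopes.pop_back();
  }

  // Returns false if Name is already defined in the innermost scope.
  bool define(const std::string &Name) {
    return Scopes.back().insert(Name).second;
  }

  // Looks Name up from the innermost scope outwards.
  bool isVisible(const std::string &Name) const {
    for (auto It = Scopes.rbegin(); It != Scopes.rend(); ++It)
      if (It->count(Name))
        return true;
    return false;
  }

private:
  std::vector<std::unordered_set<std::string>> Scopes;
};

struct Outcome {
  bool ThenDefOk;       // did `defvar bar` in the then-branch succeed?
  bool ElseDefOk;       // did `defvar bar` in the else-branch succeed?
  bool BarVisibleAfter; // can `defvar baz = bar` after the if resolve bar?
};

// Models: if ... then { defvar bar } else { defvar bar }  defvar baz = bar;
// Assumption: both branches are visited, since names are recorded at parse time.
Outcome parseIfElse(bool BranchesOpenScope) {
  ScopeStack S;
  Outcome R{};

  if (BranchesOpenScope) S.pushScope();
  R.ThenDefOk = S.define("bar");
  if (BranchesOpenScope) S.popScope();

  if (BranchesOpenScope) S.pushScope();
  R.ElseDefOk = S.define("bar");
  if (BranchesOpenScope) S.popScope();

  R.BarVisibleAfter = S.isVisible("bar");
  return R;
}
```

Under these assumptions, `parseIfElse(true)` accepts both definitions and hides `bar` after the `if`, while `parseIfElse(false)` rejects the second definition yet leaves `bar` visible afterwards, which is the inconsistency the comment points at.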

Dec 15 2019, 1:40 PM · Restricted Project
nhaehnle added a comment to D71407: [TableGen] Introduce a `defvar` statement..
  • should a defvar in a foreach be local to it? (Probably)
  • should I allow defvar inside a class or def as well as in a multiclass, defining variables that vanish at the closing brace but can be referred to by the definitions of proper class member values? (Possibly)
  • is defvar the sensible name? (def and let were already taken; I considered C++ using, but thought defvar fits reasonably well with the existing defset)
Dec 15 2019, 1:31 PM · Restricted Project

Dec 13 2019

nhaehnle added a comment to D71213: [Alignment][NFC] CreateMemSet use MaybeAlign.

LLVM has this LLVM_ATTRIBUTE_DEPRECATED macro, it's convenient to get a warning but it only works when building without -Wall.

Dec 13 2019, 5:58 AM · Restricted Project, Restricted Project
nhaehnle accepted D70315: [InstCombine][AMDGPU] Trim more components of *buffer_load.

LGTM

Dec 13 2019, 5:58 AM · Restricted Project
nhaehnle added a comment to D71348: Add ExternalAAWrapperPass to createLegacyPMAAResults..

Thanks for the heads-up. I think this is perfectly fine from an AMDGPU perspective. It also seems like a simple oversight that the ExternalAAWrapperPass is not in that list, so this LGTM in general, but I'd wait a bit for somebody who is more familiar with alias analysis to weigh in.

Dec 13 2019, 5:49 AM · Restricted Project
nhaehnle added a comment to D71358: AMDGPU: Remove denormal subtarget features.

Would it be possible to do this patch in two steps:

Dec 13 2019, 5:22 AM · Restricted Project
nhaehnle added a comment to D71386: [AMDGPU] Remove unnecessary v_mov from a register to itself in WQM lowering..

This basically LGTM

Dec 13 2019, 5:21 AM · Restricted Project

Dec 11 2019

nhaehnle committed rGf21c081b78ec: CodeGen: Allow annotations on globals in non-zero address space (authored by nhaehnle).
CodeGen: Allow annotations on globals in non-zero address space
Dec 11 2019, 4:28 AM
nhaehnle closed D71208: CodeGen: Allow annotations on globals in non-zero address space.
Dec 11 2019, 4:28 AM · Restricted Project
nhaehnle added a comment to D71341: [VE,#4] Target vector intrinsics.

As a general rule, I think it would be preferable for patches such as this one to be split up further, especially when they touch common code. For example, why does a patch by the name "Target vector intrinsics" contain a change to SjLjEHPrepare?

Dec 11 2019, 4:18 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project
nhaehnle accepted D71191: [TableGen] Add bang-operators !getop and !setop..

Okay, digging in further, the feature is actually used quite a bit in pattern definitions like:

def : GCNPat <
  (VGPRImm<(i32 imm)>:$imm),
  (V_MOV_B32_e32 imm:$imm)
>;

or

def: Pat<(s32_0ImmPred:$s16), (A2_tfrsi imm:$s16)>;

The first really ought to be expressible by placing the $imm inside the inner DAG expression, but for some reason the TableGen backend doesn't allow that. I don't know about the second... maybe if it was wrapped as (i32 s32_0ImmPred:$s16)? Fundamentally, the feature seems to be used as a hacky workaround for the fact that SelectionDAG patterns don't have an "outer" DAG, and having a naked expression like s32_0ImmPred:$s16 is not possible.

Dec 11 2019, 4:00 AM · Restricted Project
nhaehnle added a comment to D71191: [TableGen] Add bang-operators !getop and !setop..

The name suffix _can_ be applied to the operator:

def op;
def Foo {
  dag x = (op:$blah 1:$foo, 2, 3);
}
// produces:
------------- Defs -----------------
def Foo {
  dag x = (op:blah 1:$foo, 2, 3);
}
def op {
}

Though the inconsistent output and undocumented nature of it suggests that this feature is hardly ever used...

Dec 11 2019, 3:44 AM · Restricted Project

Dec 10 2019

nhaehnle added a comment to D71208: CodeGen: Allow annotations on globals in non-zero address space.

My concern is that there's something that's going to blow up or miscompile if we start passing in constants that aren't in a regular address space. Aren't there kinds of annotations which get persisted into the emitted code?

Dec 10 2019, 12:06 AM · Restricted Project

Dec 9 2019

nhaehnle added a comment to D71191: [TableGen] Add bang-operators !getop and !setop..

This looks fine, but have you thought through !setop discarding the name on the operand? The only use of this feature seems to be naming DAG pattern nodes, a feature that seems to have been around forever but I haven't seen used in the code I've touched...

Dec 9 2019, 11:48 PM · Restricted Project
nhaehnle accepted D71192: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns.

LGTM

Dec 9 2019, 11:39 PM · Restricted Project
nhaehnle accepted D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.

Thanks, LGTM

Dec 9 2019, 11:39 PM · Restricted Project
nhaehnle accepted D71195: [TableGen] Permit dag operators to be unset..

LGTM

Dec 9 2019, 11:30 PM · Restricted Project
nhaehnle added a comment to D71208: CodeGen: Allow annotations on globals in non-zero address space.

The questions I'd like to have answered before I can approve this are:

  • whether there are clients of @llvm.global.annotations that will have problems with non-0 address spaces and
Dec 9 2019, 11:21 PM · Restricted Project
nhaehnle created D71208: CodeGen: Allow annotations on globals in non-zero address space.
Dec 9 2019, 6:50 AM · Restricted Project

Dec 8 2019

nhaehnle added a comment to D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.

This mostly looks good, except I strongly suspect that all other export intrinsics should have their done bit set to 0 in this case.

Dec 8 2019, 10:45 AM · Restricted Project
Herald updated subscribers of D54439: CMake: Make most target symbols hidden by default.
Dec 8 2019, 10:36 AM · Restricted Project
nhaehnle added a comment to D71115: [TableGen] Add a permissive version of !con..

I don't feel particularly strongly either way, but it seems to me that allowing an unset operator could perhaps be the most elegant solution after all. I'm lacking the context of your particular use case, but it seems to me you're attempting to piece together DAG fragments in a generic way where the operator simply isn't known yet by some of the pieces of TableGen you're writing, and using an unset operator could be a more natural way of expressing that than coming up with some artificial operator that is simply thrown away later.

Dec 8 2019, 10:27 AM · Restricted Project
nhaehnle added a comment to D71132: PostRA Machine Sink should take care of COPY defining register that is a sub-register by another COPY source operand.

With this change, it looks as though accumulateUsedDefed is called multiple times on some paths. That seems wrong. It does seem plausible that we'd have to keep accumulating for all code paths.

Dec 8 2019, 10:18 AM · Restricted Project
nhaehnle accepted D65966: AMDGPU/SILoadStoreOptimizer: Improve merging of out of order offsets.

Thanks. This basically looks good to me, some minor nitpicks.

Dec 8 2019, 10:18 AM · Restricted Project
nhaehnle accepted D71045: AMDGPU/SILoadStoreOptimillzer: Refactor CombineInfo struct.

Thanks for doing this, LGTM.

Dec 8 2019, 9:59 AM · Restricted Project

Nov 20 2019

nhaehnle accepted D69794: [AMDGPU][SILoadStoreOptimizer] Merge TBUFFER loads/stores.

As a general rule, I think we should avoid unrelated whitespace changes, of which this patch has a few, because it can make merging/cherry-picking/rebasing/blaming more confusing. If you're using clang-format, a good way to achieve this is to pipe the output of git show -U0, git diff -U0 or similar into clang-format-diff -p1. It's acceptable here though.

Nov 20 2019, 12:45 AM · Restricted Project

Nov 16 2019

nhaehnle committed rGd8f7c68e28bd: AMDGPU/SILoadStoreOptimizer: fix a likely bug introduced recently (authored by nhaehnle).
AMDGPU/SILoadStoreOptimizer: fix a likely bug introduced recently
Nov 16 2019, 2:37 AM
nhaehnle closed D68690: AMDGPU/SILoadStoreOptimizer: fix a likely bug introduced recently.
Nov 16 2019, 2:37 AM · Restricted Project

Nov 13 2019

nhaehnle added a comment to D68690: AMDGPU/SILoadStoreOptimizer: fix a likely bug introduced recently.

ping^2

Nov 13 2019, 12:52 PM · Restricted Project

Oct 30 2019

nhaehnle added a comment to D68994: [RFC] Redefine `convergent` in terms of dynamic instances.

Thank you for your detailed comments!

Oct 30 2019, 7:00 AM · Restricted Project
nhaehnle updated the diff for D68994: [RFC] Redefine `convergent` in terms of dynamic instances.
  • add some more expository text and examples
  • handle the case of speculatable+convergent
Oct 30 2019, 7:00 AM · Restricted Project

Oct 29 2019

nhaehnle added a comment to D68690: AMDGPU/SILoadStoreOptimizer: fix a likely bug introduced recently.

On the issue of a test case: I don't think it's practically possible to hit the crash I was worried about right now, because AddrIdx[0] happens to be small enough in all cases that it will be a valid operand index in all cases, and then hasSameBaseAddress will bail out for machine instructions of a different class because the register comparison of the first index will fail. However, that's just an artifact of how machine operand indices happen to be laid out today. I still maintain that the change itself is the right way to do things.

Oct 29 2019, 6:08 AM · Restricted Project
nhaehnle added a comment to D68994: [RFC] Redefine `convergent` in terms of dynamic instances.

What is the expectation of lowering of a loop like the one you mentioned above?

int divergent_key = ...;
int v = ...;
int sum;

for (;;) {
  tok = @llvm.convergence.anchor()
  int uniform_key = readfirstlane(divergent_key);
  if (uniform_key == divergent_key) {
    sum = subgroup_reduce_add(v);
    @llvm.convergence.point(tok)
    break;
  }
}

In particular, the expectation of lowering in LLVM IR is typically that the block with the "break" is an exit block and as such is moved outside of the loop (where the threads have reconverged). Is the expectation that @llvm.convergence.point is lowered in the backends to a pseudo instruction similar to DBG_VALUE (and then ignored by regalloc and such) until the final control-flow block ordering is decided, so that the "unconverged" part of the exit block is executed inside the loop instead of outside?

Oct 29 2019, 3:46 AM · Restricted Project
nhaehnle added a comment to D69498: IR: Invert convergent attribute handling.

As you know, I am very much in favor of this change, and really anything that re-establishes the rule that code is treated maximally conservatively when there are no attributes or metadata.

Oct 29 2019, 3:46 AM · Restricted Project

Oct 22 2019

nhaehnle accepted D68585: AMDGPU/GlobalISel: Handle flat/global G_ATOMIC_CMPXCHG.

LGTM

Oct 22 2019, 6:56 AM
nhaehnle accepted D68309: GlobalISel: Implement widenScalar for G_INSERT_VECTOR_ELT.

LGTM

Oct 22 2019, 6:47 AM
nhaehnle added a comment to D69182: [AMDGPU] Fix Vreg_1 PHI lowering in SILowerI1Copies..

Right, I don't think RPO hurts, but I also didn't (and still don't) see how it would help due to the possible circularity between PHIs, so I'd rather stick to the simpler iteration. Otherwise somebody looking at the code later may think the RPO is important for some aspect of the algorithm.

Oct 22 2019, 5:06 AM · Restricted Project
nhaehnle accepted D69287: [AMDGPU] Allows tied operand subreg folding.

LGTM

Oct 22 2019, 4:57 AM · Restricted Project

Oct 18 2019

nhaehnle added a comment to D68453: TableGen: Allow 'a+b' in TableGen language.

Thanks. Any thoughts on the higher-level algorithmic questions?

Oct 18 2019, 2:44 AM · Restricted Project
nhaehnle accepted D67345: [InstCombine] Allow values with multiple users in SimplifyDemandedVectorElts.

LGTM

Oct 18 2019, 2:40 AM · Restricted Project

Oct 16 2019

nhaehnle added a comment to D69010: [AMDGPU] Supress unused sdwa insts generation.

I haven't looked into the set of instructions that are affected by this, but the TableGen LGTM.

Oct 16 2019, 8:08 AM · Restricted Project
nhaehnle added a comment to D68994: [RFC] Redefine `convergent` in terms of dynamic instances.

Are you planning on adding codegen support in a separate patch?

Oct 16 2019, 8:08 AM · Restricted Project

Oct 15 2019

nhaehnle removed reviewers for D68994: [RFC] Redefine `convergent` in terms of dynamic instances: arsenm, alex-t, tpr, t-tye.
Oct 15 2019, 8:56 AM · Restricted Project
nhaehnle created D68994: [RFC] Redefine `convergent` in terms of dynamic instances.
Oct 15 2019, 8:56 AM · Restricted Project
nhaehnle added a comment to D68309: GlobalISel: Implement widenScalar for G_INSERT_VECTOR_ELT.

Yeah, it doesn't seem very relevant for the element argument, though precisely because of that I'd feel more comfortable with ANYEXT.

Oct 15 2019, 5:06 AM
nhaehnle added a comment to D68970: AMDGPU: Fix infinite searches in SIFixSGPRCopies.

Thanks for tracking that down. I agree with Stas that SmallSet should be a better choice here. Also some minor comments on the tests, apart from that looks good to me.

Oct 15 2019, 2:37 AM · Restricted Project
nhaehnle accepted D64911: [AMDGPU] Extend the SI Load/Store optimizer.

LGTM

Oct 15 2019, 2:11 AM · Restricted Project

Oct 14 2019

nhaehnle added a comment to D68635: [AMDGPU] Come back patch for the 'Assign register class for cross block values according to the divergence.'.

Thank you for persisting through this.

Oct 14 2019, 9:57 AM · Restricted Project
nhaehnle added a comment to D68690: AMDGPU/SILoadStoreOptimizer: fix a likely bug introduced recently.

Testcase?

Oct 14 2019, 9:38 AM · Restricted Project
nhaehnle added inline comments to D64911: [AMDGPU] Extend the SI Load/Store optimizer.
Oct 14 2019, 9:38 AM · Restricted Project
nhaehnle added inline comments to D68873: [AMDGPU] Amend target loop unroll defaults.
Oct 14 2019, 9:38 AM · Restricted Project
nhaehnle added a comment to D68893: AMDGPU: Split flat offsets that don't fit in DAG.

Mostly LGTM, but I wonder about the high level intention here. Is this intended to expose new load/store merging opportunities? If so, is there a test for this? Or is there some part of SIFoldOperands that can now be removed?

Oct 14 2019, 9:38 AM

Oct 9 2019

nhaehnle added a comment to D65966: AMDGPU/SILoadStoreOptimizer: Improve merging of out of order offsets.

I think the code would benefit from the refactoring I've mentioned on the other patch, where the lists only hold a structure with information on a single instruction. Maybe call it CandidateInfo (information of one instruction, persistent in lists) vs. CombineInfo (information on a pair, only temporary).
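
The proposed split could look roughly like the following. This is a minimal sketch, not the actual SILoadStoreOptimizer code: every field name and the `tryCombine` helper are hypothetical, and the adjacency rule stands in for the pass's real legality checks.

```cpp
#include <cassert>
#include <cstdint>

// Information on ONE instruction; this is what would live in the lists.
// All fields are hypothetical; the real pass tracks far more state.
struct CandidateInfo {
  unsigned InstIndex; // position of the instruction in its block
  unsigned BaseReg;   // register holding the base address
  uint32_t Offset;    // byte offset from the base
  uint32_t Width;     // number of bytes accessed
};

// Information on a PAIR; built only temporarily while attempting a merge.
struct CombineInfo {
  const CandidateInfo *First;  // access at the lower offset
  const CandidateInfo *Second; // access at the higher offset
};

// Fills Out and returns true if A and B share a base register and touch
// adjacent memory, in either order of appearance.
bool tryCombine(const CandidateInfo &A, const CandidateInfo &B,
                CombineInfo &Out) {
  assert(A.Width > 0 && B.Width > 0 && "accesses must be non-empty");
  if (A.BaseReg != B.BaseReg)
    return false;
  const CandidateInfo &Lo = A.Offset <= B.Offset ? A : B;
  const CandidateInfo &Hi = A.Offset <= B.Offset ? B : A;
  if (Lo.Offset + Lo.Width != Hi.Offset)
    return false;
  Out.First = &Lo;
  Out.Second = &Hi;
  return true;
}
```

With this shape, the lists never carry half-filled "second instruction" fields; pairing state exists only for the duration of a merge attempt.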

Oct 9 2019, 4:18 AM · Restricted Project
nhaehnle created D68690: AMDGPU/SILoadStoreOptimizer: fix a likely bug introduced recently.
Oct 9 2019, 4:00 AM · Restricted Project
nhaehnle added a parent revision for D68690: AMDGPU/SILoadStoreOptimizer: fix a likely bug introduced recently: D65961: AMDGPU/SILoadStoreOptimizer: Optimize scanning for mergeable instructions.
Oct 9 2019, 4:00 AM · Restricted Project
nhaehnle added a child revision for D65961: AMDGPU/SILoadStoreOptimizer: Optimize scanning for mergeable instructions: D68690: AMDGPU/SILoadStoreOptimizer: fix a likely bug introduced recently.
Oct 9 2019, 4:00 AM · Restricted Project
nhaehnle added a comment to D65961: AMDGPU/SILoadStoreOptimizer: Optimize scanning for mergeable instructions.

Thank you for doing this, it seems quite useful. As a follow-up to this change, do you think it makes sense to refactor CombineInfo a bit? We have a list of mergeable instructions, but the CombineInfo structure also has fields for a second instruction, which are only for temporary use, which is a bit odd.

Oct 9 2019, 3:51 AM · Restricted Project
nhaehnle accepted D68600: AMDGPU/GlobalISel: Fix crash on wide constant load with VGPR pointer.

I do have one question. Apart from that it LGTM.

Oct 9 2019, 3:51 AM
nhaehnle added inline comments to D68092: [AMDGPU] Invert the handling of skip insertion..
Oct 9 2019, 3:42 AM · Restricted Project

Oct 8 2019

nhaehnle added inline comments to D68600: AMDGPU/GlobalISel: Fix crash on wide constant load with VGPR pointer.
Oct 8 2019, 8:26 AM
nhaehnle accepted D68607: [AMDGPU] Disable unused gfx10 dpp instructions.

LGTM

Oct 8 2019, 8:16 AM · Restricted Project
nhaehnle added a comment to D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands.

I think this should be good to go.

Oct 8 2019, 8:16 AM · Restricted Project
nhaehnle added inline comments to D68092: [AMDGPU] Invert the handling of skip insertion..
Oct 8 2019, 8:07 AM · Restricted Project
nhaehnle added a comment to D68453: TableGen: Allow 'a+b' in TableGen language.

This does seem like a useful addition (heh) to the grammar. There is one thing that can go wrong in the future: parentheses are already reserved for DAGs. Brackets and braces also already have their own purpose. We could designate (( without a space for parentheses, with the risk that ( ( and (( can mean different things, with the first one making sense in DAGs. Though DAG operators are usually defs, so the risk of weirdness may be acceptable.

Oct 8 2019, 6:51 AM · Restricted Project
nhaehnle added inline comments to D64911: [AMDGPU] Extend the SI Load/Store optimizer.
Oct 8 2019, 6:23 AM · Restricted Project
nhaehnle accepted D68424: [tblgen] Add getOperatorAsDef() to Record.

Thanks, this makes a lot of sense. LGTM.

Oct 8 2019, 6:14 AM · Restricted Project
nhaehnle added a comment to D66709: AMDGPU: Introduce a flag to disable mul24 intrinsic formation.

Huh, interesting. I guess nobody had the time to really dig into why. Anyway, thanks. I think it'd be good to put this kind of information into our commit messages.

Oct 8 2019, 6:14 AM
nhaehnle committed rGdf6e67697bfb: AMDGPU: Propagate undef flag during pre-RA exec mask optimizations (authored by nhaehnle).
AMDGPU: Propagate undef flag during pre-RA exec mask optimizations
Oct 8 2019, 5:47 AM
nhaehnle committed rG7febdb7f27df: MachineSSAUpdater: insert IMPLICIT_DEF at top of basic block (authored by nhaehnle).
MachineSSAUpdater: insert IMPLICIT_DEF at top of basic block
Oct 8 2019, 5:47 AM
nhaehnle closed D68184: AMDGPU: Propagate undef flag during pre-RA exec mask optimizations.
Oct 8 2019, 5:47 AM · Restricted Project
nhaehnle closed D68183: MachineSSAUpdater: insert IMPLICIT_DEF at top of basic block.
Oct 8 2019, 5:47 AM · Restricted Project
nhaehnle added a comment to D68309: GlobalISel: Implement widenScalar for G_INSERT_VECTOR_ELT.

For the vector itself and the inserted element, shouldn't this be using G_ANYEXT instead? Looking at the corresponding test, the G_SHL/G_ASHR on $vgpr0 should be unnecessary based on the original code.
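
To illustrate why an any-extend can suffice, here is a small C++ model. It assumes that the consumer only ever observes the low 16 bits of the widened value; the function names are hypothetical and the code models plain integer arithmetic, not GlobalISel itself.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extends the low 16 bits of a 32-bit register value. This computes the
// same result as a shl-by-16 / ashr-by-16 pair, written with unsigned
// arithmetic so that no negative value is shifted.
uint32_t signExtendLow16(uint32_t Reg) {
  uint32_t Low = Reg & 0xFFFFu;
  return (Low ^ 0x8000u) - 0x8000u;
}

// Models a consumer that truncates back to 16 bits (assumption: nothing else
// reads the high half).
uint16_t readLow16(uint32_t Reg) { return static_cast<uint16_t>(Reg); }

// Whatever ends up in the high half, the truncating consumer sees the same
// value with or without the explicit sign extension.
bool extensionIsUnobservable(uint32_t Reg) {
  return readLow16(Reg) == readLow16(signExtendLow16(Reg));
}

void checkGarbageHighBitsAreHarmless() {
  const uint32_t HighPatterns[] = {0x00000000u, 0xFFFF0000u, 0x12340000u};
  for (uint32_t High : HighPatterns)
    for (uint32_t Low = 0; Low <= 0xFFFFu; Low += 0x1111u)
      assert(extensionIsUnobservable(High | Low));
}
```

If that assumption holds, the extra shift pair only costs instructions, which is the motivation for preferring G_ANYEXT here.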

Oct 8 2019, 5:46 AM

Oct 2 2019

nhaehnle added a comment to D66709: AMDGPU: Introduce a flag to disable mul24 intrinsic formation.

What's the reason for having this flag?

Oct 2 2019, 5:28 AM
nhaehnle accepted D68200: [AMDGPU] Extend buffer intrinsics with swizzling.

LGTM

Oct 2 2019, 5:20 AM · Restricted Project
nhaehnle added inline comments to D68184: AMDGPU: Propagate undef flag during pre-RA exec mask optimizations.
Oct 2 2019, 5:02 AM · Restricted Project
nhaehnle updated the diff for D68184: AMDGPU: Propagate undef flag during pre-RA exec mask optimizations.

Use getRegStateUndef

Oct 2 2019, 5:02 AM · Restricted Project

Sep 30 2019

nhaehnle added a comment to D67003: AMDGPU: Don't put constants in .text for Mesa.

FWIW, the reason that radeonsi uses amdgcn-- is the scratch buffer ABI.

Sep 30 2019, 1:00 PM · Restricted Project
nhaehnle accepted D65496: AMDGPU/SILoadStoreOptimizer: Add helper functions for working with CombineInfo.

LGTM, with two nitpicks inline.

Sep 30 2019, 11:13 AM · Restricted Project
nhaehnle added inline comments to D67148: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize.
Sep 30 2019, 10:54 AM · Restricted Project
nhaehnle added a comment to D68092: [AMDGPU] Invert the handling of skip insertion..

Thank you for working on this.

Sep 30 2019, 10:43 AM · Restricted Project
nhaehnle accepted D68197: AMDGPU/GlobalISel: Legalize G_GLOBAL_VALUE.

LGTM, with two nitpicks.

Sep 30 2019, 10:13 AM
nhaehnle accepted D67599: AMDGPU/GlobalISel: Select s1 src G_SITOFP/G_UITOFP.

LGTM

Sep 30 2019, 10:05 AM
nhaehnle added inline comments to D68200: [AMDGPU] Extend buffer intrinsics with swizzling.
Sep 30 2019, 10:03 AM · Restricted Project
nhaehnle added a comment to D67345: [InstCombine] Allow values with multiple users in SimplifyDemandedVectorElts.

This is a useful change, but there is an unfortunate asymmetry here in how the code is structured: in addition to extractelement, we could also have shufflevector users (or masked stores etc.). Presumably we'd be able to handle all of those together without duplicating the code. Is there a way to take this into account?

Sep 30 2019, 9:59 AM · Restricted Project

Sep 28 2019

nhaehnle created D68184: AMDGPU: Propagate undef flag during pre-RA exec mask optimizations.
Sep 28 2019, 9:41 AM · Restricted Project
nhaehnle created D68183: MachineSSAUpdater: insert IMPLICIT_DEF at top of basic block.
Sep 28 2019, 9:41 AM · Restricted Project

Aug 27 2019

nhaehnle added inline comments to D66666: [AMDGPU] Remove unnecessary movs for v_fmac operands.
Aug 27 2019, 8:56 AM · Restricted Project

Aug 7 2019

nhaehnle added a comment to D65813: Partially revert D61491 "AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0".

Hmm, those test changes were unexpected.

Aug 7 2019, 2:26 AM · Restricted Project

Aug 5 2019

nhaehnle accepted D65719: AMDGPU: Disambiguate v3f16 format in load/store tables.

LGTM

Aug 5 2019, 5:28 AM
nhaehnle accepted D65604: AMDGPU/GlobalISel: Alternative mappings for constants.

LGTM

Aug 5 2019, 5:25 AM
nhaehnle accepted D65601: AMDGPU/GlobalISel: Don't reject shader types.

Sure, why not.

Aug 5 2019, 5:22 AM