Page MenuHomePhabricator

nhaehnle (Nicolai Hähnle)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 9 2015, 4:06 AM (296 w, 1 d)

Recent Activity

Today

nhaehnle added inline comments to D104013: [LangRef] State that the based-on relation is for pointer typed values only.
Sat, Jun 12, 12:28 AM · Restricted Project

Thu, May 20

nhaehnle committed rGa888e492f601: [IR] Memory intrinsics are not unconditionally `nosync` (authored by nhaehnle).
[IR] Memory intrinsics are not unconditionally `nosync`
Thu, May 20, 6:41 PM
nhaehnle closed D102295: [IR] Memory intrinsics are not unconditionally `nosync`.
Thu, May 20, 6:41 PM · Restricted Project
nhaehnle committed rG77b83d3088e6: [tests] Update Transforms/DeadStoreElim/multiblock-malloc-free.ll (authored by nhaehnle).
[tests] Update Transforms/DeadStoreElim/multiblock-malloc-free.ll
Thu, May 20, 5:37 PM

Tue, May 18

nhaehnle retitled D102295: [IR] Memory intrinsics are not unconditionally `nosync` from [FunctionAttr][Attributor] Cleanup `nosync` checks of memory intrinsics to [IR] Memory intrinsics are not unconditionally `nosync`.
Tue, May 18, 9:10 PM · Restricted Project
nhaehnle updated the diff for D102295: [IR] Memory intrinsics are not unconditionally `nosync`.

Invert the logic of the patch: mem intrinsics shouldn't be unconditionally marked nosync in the first place.

Tue, May 18, 9:09 PM · Restricted Project

May 11 2021

nhaehnle retitled D102295: [IR] Memory intrinsics are not unconditionally `nosync` from [FunctionAttr][Attributor][nosync] Remove redundant check to [FunctionAttr][Attributor] Cleanup `nosync` checks of memory intrinsics.
May 11 2021, 5:52 PM · Restricted Project
nhaehnle updated the diff for D102295: [IR] Memory intrinsics are not unconditionally `nosync`.

Fired too quickly, the Attributor actually had a bug and an incorrect lit test. This version should fix that.

May 11 2021, 5:51 PM · Restricted Project
nhaehnle requested review of D102295: [IR] Memory intrinsics are not unconditionally `nosync`.
May 11 2021, 5:42 PM · Restricted Project
nhaehnle requested review of D102290: [nofree] Only synchronization with release ordering breaks nofree.
May 11 2021, 5:20 PM · Restricted Project
nhaehnle updated the diff for D101701: [nofree] Refine concurrency requirements.

Split up the patch as per @reames' request

May 11 2021, 5:18 PM · Restricted Project
nhaehnle added a comment to D101701: [nofree] Refine concurrency requirements.

nofree functions imply nofree arguments so it very much makes a difference if we make it harder to infer
and annotate the former. As we already moved to a more semantic-based definition of escaped we can actually
look at an example where this makes a difference. Take pointers passed via memory to see how you cannot just
move the nofree to account for the change made by this patch:

struct { int *a } s;
s.a = malloc()               // this is a store.
nofree_nocapture_only(&s)    // we can potentially show `s.a` is not captured in the function but we cannot
                             // annotate that fact as a `nofree` argument without argument promotion, which
                             // we cannot always do.
// s.a is still known to be dereferenceable even without `nosync`. The function `nofree` is needed, `nofree`
// argument attributes are not involved.

Is that actually true? Regardless of the nocapture flag, s.a may be passed to another thread which then frees it. (nocapture only applies to &s in the first place, and even then, it does not say anything about temporary copies of the pointer that may be passed to other threads.)

May 11 2021, 3:53 PM · Restricted Project
nhaehnle added a comment to D101701: [nofree] Refine concurrency requirements.

Hi Johannes,

May 11 2021, 6:34 AM · Restricted Project

May 8 2021

nhaehnle added a comment to D101701: [nofree] Refine concurrency requirements.

Let me put down some thoughts:

  1. What is the motivation here? I mean, which functions, except very few intrinsic, would be "strong" nofree but not nosync?
May 8 2021, 10:43 PM · Restricted Project

May 6 2021

nhaehnle retitled D101701: [nofree] Refine concurrency requirements from [RFC][nofree] Refine concurrency requirements to [nofree] Refine concurrency requirements.
May 6 2021, 4:37 PM · Restricted Project
nhaehnle added inline comments to D101701: [nofree] Refine concurrency requirements.
May 6 2021, 4:35 PM · Restricted Project
nhaehnle updated the diff for D101701: [nofree] Refine concurrency requirements.

Address review feedback: Adding the required change to FunctionAttrs that @reames pointed out; in the new version, I also made sure to really go through all test cases.

May 6 2021, 4:34 PM · Restricted Project

May 5 2021

nhaehnle committed rG6adcdd26139c: [tests] Update Transforms/FunctionAttrs/nosync.ll (authored by nhaehnle).
[tests] Update Transforms/FunctionAttrs/nosync.ll
May 5 2021, 4:39 PM

May 1 2021

nhaehnle added a comment to D100676: [nofree] Attempt to further refine concurrency/capture requirements.

I tried my hands at a change in the opposite direction in D101701

May 1 2021, 2:48 PM · Restricted Project
nhaehnle requested review of D101701: [nofree] Refine concurrency requirements.
May 1 2021, 2:47 PM · Restricted Project

Apr 28 2021

nhaehnle added a comment to D100676: [nofree] Attempt to further refine concurrency/capture requirements.

Apologies if this is obvious somehow, but could you please point at those "deref" changes? I think I missed that part of the conversation. My hope would be that it could be treated orthogonally to the nofree/nosync question.

Apr 28 2021, 3:07 PM · Restricted Project

Apr 23 2021

nhaehnle added a comment to D69498: IR: Invert convergent attribute handling.

I don't have much to add to the conversation except to point out that this definition defines noconvergent in terms of divergent control flow, but the langref itself doesn't define what divergent control flow is, which makes it an incomplete spec. (Perhaps I'm just restating @arsenm's objections.) This seems unsatisfactory to me but I have no idea what to do about it. I agree with @sameerds that the current definition of convergent is too restrictive because in practice we really do want to be able to move convergent calls past uniform control flow.

Apr 23 2021, 9:34 AM · Restricted Project

Apr 17 2021

nhaehnle added a comment to D100676: [nofree] Attempt to further refine concurrency/capture requirements.

I think the crux of the discussion is the second half of this sentence:

Apr 17 2021, 4:52 AM · Restricted Project

Apr 15 2021

nhaehnle added a comment to D100141: [nofree] Restrict semantics to memory visible to caller.

Thank you for doing this.

Apr 15 2021, 8:03 AM · Restricted Project

Apr 13 2021

nhaehnle added a comment to D85603: IR: Add convergence control operand bundle and intrinsics.

To address this there was an attempt to invert the behavior of convergent attribute in this patch (https://reviews.llvm.org/D69498) then the frontend wouldn't need to generate the attribute everywhere and the optimizer wouldn't need to undo what frontend does. The change in this review doesn't address (2) as far as I can see - it seems it only generalized old convergent semantics to cover the cases with non-uniform CF. I am not clear yet about the details of how and what frontend should generate in IR for this new logic but it looks more complex than before. And if we have to stick to the conservative approach of assuming everything is convergent as it is now this might complicate and slow down the parsing. So I am just checking whether addressing (2) is still feasible with the new approach or it is not a direction we can/should go?

Apr 13 2021, 8:18 AM · Restricted Project
nhaehnle added a comment to D85603: IR: Add convergence control operand bundle and intrinsics.

Hi @jholewinski, sorry for missing your comment earlier. It's been a while! I still need to work through the rest of the comments here, but there's a pretty crucial point here that seems to have been missed:

Apr 13 2021, 8:10 AM · Restricted Project
nhaehnle added a comment to D85609: Transforms: add ConvergenceControlHeuristic pass.

Hi @Anastasia, thank you for your comments. I replied inline, but a high-level point upfront is that in many ways, this patch only exists because HLL don't really have well-defined semantics for convergent operations yet. Most of us have a shared mental model of what they should be for most high-level constructs, but intuition breaks down for the trickier corner cases. I made some proposals in the Khronos Memory Model TSG for how useful semantics could be added to HLL, taking the corner cases into account, but on my end all of this is on hold while I'm on leave.

Apr 13 2021, 7:58 AM · Restricted Project

Apr 7 2021

nhaehnle added a comment to D99883: [TableGen] Make behavior of list slice suffix consistent across all values.

The change makes sense to me. Does this need a documentation update?

Apr 7 2021, 12:14 AM · Restricted Project

Apr 4 2021

nhaehnle added a comment to D99507: [amdgpu] Add a pass to avoid jump into blocks with 0 exec mask..

I think this requires a lot more thought.

+1

What I'd like to know: why are we reloading a lane mask via V_READFIRSTLANE in the first place? I would expect one of two types of reload:

  1. Load from a fixed lane of a VGPR using V_READLANE.

That depends on how we spill a SGPR by writing a fixed lane or write an active lane. The 1st one, without saving/restoring, we will overwrite the live values in the inactive lanes. HPC workloads are hit by that issue and cannot run correctly.

Apr 4 2021, 6:35 AM · Restricted Project

Apr 1 2021

nhaehnle requested changes to D99507: [amdgpu] Add a pass to avoid jump into blocks with 0 exec mask..

I think this requires a lot more thought.

Apr 1 2021, 3:12 PM · Restricted Project

Dec 29 2020

nhaehnle committed rGb76014a4f15a: RegionInfo: use a range-based for loop [NFCI] (authored by nhaehnle).
RegionInfo: use a range-based for loop [NFCI]
Dec 29 2020, 7:01 AM
nhaehnle closed D92932: RegionInfo: use a range-based for loop [NFCI].
Dec 29 2020, 7:01 AM · Restricted Project

Dec 27 2020

nhaehnle added a comment to D93376: [LangRef] Clarify the semantics of lifetime intrinsics.

It's fair to say that this patch is a change of semantics by forbidding the use of the lifetime intrinsics in certain places where they were previously allowed.

They were allowed by the LangRef but not supported in the code

Dec 27 2020, 1:45 AM · Restricted Project

Dec 23 2020

nhaehnle added inline comments to D93708: [AMDGPU] Add a new Clamp Pattern to the GlobalISel Path..
Dec 23 2020, 11:27 PM · Restricted Project, Restricted Project

Dec 21 2020

nhaehnle added a comment to D93376: [LangRef] Clarify the semantics of lifetime intrinsics.

Also, I did not suggest to change the semantics without an RFC and everything else that is needed. You suggest otherwise in your response. I did propose alternatives to fix your bug, but most of my responses show how the restrictions you want to put in place are making (potentially existing) optimizations invalid.

The thing is, in our view (aqjune, nlopes, and me) you *are* changing the semantics by allowing the lifetime intrinsics to be used on things like "malloc". From all we can tell, the intrinsics were never meant to be used with anything but alloca, so for all intents and purposes, currently, they only support alloca.

Dec 21 2020, 9:52 PM · Restricted Project
nhaehnle added inline comments to D93125: Update AMDGPU PAL usage documentation.
Dec 21 2020, 12:24 PM · Restricted Project

Dec 14 2020

nhaehnle added reviewers for D93125: Update AMDGPU PAL usage documentation: tpr, foad.
Dec 14 2020, 9:59 AM · Restricted Project
nhaehnle added a comment to D93125: Update AMDGPU PAL usage documentation.

Thanks for expanding on this documentation.

Dec 14 2020, 9:59 AM · Restricted Project

Dec 9 2020

nhaehnle added inline comments to D92661: [RFC] Fix TLS and Coroutine.
Dec 9 2020, 7:50 AM · Restricted Project, Restricted Project
nhaehnle abandoned D83088: Introduce CfgTraits abstraction.

Superseded by D92924, D92925, D92926

Dec 9 2020, 7:26 AM · Restricted Project, Restricted Project, Restricted Project
nhaehnle updated the diff for D92924: Add opaque Handle infrastructure.

Remove HandleWrapperFor which got de facto superseded by SsaContext
(subsequent patch). Update comment slightly to match.

Dec 9 2020, 7:24 AM · Restricted Project
nhaehnle added a reviewer for D92932: RegionInfo: use a range-based for loop [NFCI]: dblaikie.

Seems trivial enough, and there doesn't really seem to be an owner of this code...

Dec 9 2020, 4:31 AM · Restricted Project
nhaehnle requested review of D92932: RegionInfo: use a range-based for loop [NFCI].
Dec 9 2020, 4:30 AM · Restricted Project
nhaehnle updated the diff for D83094: Analysis: Add a GenericCycleInfo analysis.

v9:

  • rebase on factored-out Handle infrastructure
Dec 9 2020, 2:47 AM · Restricted Project
nhaehnle updated the diff for D83089: DomTree: Extract (mostly) read-only logic into type-erased base classes.

v9:

  • rebase on factored-out Handle infrastructure
Dec 9 2020, 2:46 AM · Restricted Project
nhaehnle reopened D83089: DomTree: Extract (mostly) read-only logic into type-erased base classes.
Dec 9 2020, 2:46 AM · Restricted Project
nhaehnle requested review of D92926: Add ISsaContext: Type-erased wrapper for SsaContext.
Dec 9 2020, 2:45 AM · Restricted Project
nhaehnle requested review of D92925: Add SsaContext: Template to access generic IR functionality.
Dec 9 2020, 2:44 AM · Restricted Project
nhaehnle requested review of D92924: Add opaque Handle infrastructure.
Dec 9 2020, 2:44 AM · Restricted Project

Nov 26 2020

nhaehnle added inline comments to D92086: Generalized PatternMatch & InstSimplify.
Nov 26 2020, 12:07 AM · Restricted Project

Nov 25 2020

nhaehnle added a comment to D91174: [Support] Introduce a new InstructionCost class.
(comparison being a total ordering)
/// This avoids having to add asserts the comparison operators that the states
/// are valid and users can test for validity of the cost explicitly.

Counterpoint: The asserts cost nothing in a release build, and for software maintainability, having the asserts once in the comparison operator is better than having them at every callsite. Users of this class will forget to put them there.

Nov 25 2020, 1:33 PM · Restricted Project
nhaehnle added a comment to D91086: AMDGPU: Document why we use (non-volatile) BUFFER_WBINVL1 in graphics.

After our internal discussion of this fizzled out, I am no longer actually sure whether this analysis is correct.

Nov 25 2020, 7:56 AM · Restricted Project

Nov 9 2020

nhaehnle updated the diff for D91086: AMDGPU: Document why we use (non-volatile) BUFFER_WBINVL1 in graphics.

It seems the MTYPE numbering was changed on gfx9, reflect that in the comment.

Nov 9 2020, 9:10 AM · Restricted Project
nhaehnle requested review of D91086: AMDGPU: Document why we use (non-volatile) BUFFER_WBINVL1 in graphics.
Nov 9 2020, 8:41 AM · Restricted Project

Nov 6 2020

nhaehnle added a comment to D85603: IR: Add convergence control operand bundle and intrinsics.
  • Will this paint us into a corner wrt CUDA, and specifically sm70+?

/me summons @wash, who is probably a better person to speak to this than me.

My understanding is that the semantics of <sm70 convergent are pretty similar to what is described in these examples. But starting in sm70+, each sync operation takes an arg specifying which threads in the warp participate in the instruction.

I admit I do not fully understand what the purpose of this is. At one point in time I thought it was to let humans write (or compilers generate) code like this, where the identity of the convergent instruction does not matter.

// Warning, does not seem to work on sm75
if (cond)
  __syncwarp(FULL_MASK);
else
  __syncwarp(FULL_MASK);

but my testcase, https://gist.github.com/50d1b5fedc926c879a64436229c1cc05, dies with an illegal-instruction error (715) when I make cond have different values within the warp. So, guess not?

Anyway, clearly I don't fully understand the sm70+ convergence semantics. I'd ideally like someone from nvidia (hi, @wash) to speak to whether we can represent their convergent instruction semantics using this proposal. Then we should also double-check that clang can in fact generate the relevant LLVM IR.

To extrapolate from Vinod's answer, I would say that we can represent sm70+ convergence semantics with this proposal. The situation seems to be covered by the examples in the section on hoisting and sinking. Consider the following example copied from the spec:

define void @example(...) convergent {
  %entry = call token @llvm.experimental.convergence.entry()
  %data = ...
  %id = ...
  if (condition) {
    %shuffled = call i32 @subgroupShuffle(i32 %data, i32 %id) [ "convergencectrl"(token %entry) ]
    ...
  }
}

Here, hoisting subgroupShuffle() is generally disallowed because it depends on the identity of active threads. A CUDA builtin with a mask argument similarly identifies specific threads that must be active at the set of textually unaligned calls that synchronize with each other. So any change in the control flow surrounding those calls is generally disallowed without more information. The new representation doesn't seem to restrict a more informed optimizer that can predict how the threads evolve.

Nov 6 2020, 8:44 AM · Restricted Project

Nov 5 2020

nhaehnle added a comment to D90708: [LangRef] Clarify GEP inbounds wrapping semantics.

This makes sense to me.

Nov 5 2020, 2:10 AM · Restricted Project

Nov 4 2020

nhaehnle accepted D90635: [TableGen] Add true and false literals to represent booleans.

LGTM

Nov 4 2020, 8:57 AM · Restricted Project
nhaehnle added inline comments to D90036: [AMDGPU] Emit stack frame size in metadata.
Nov 4 2020, 8:42 AM · Restricted Project
nhaehnle added a comment to D90039: [AsmWriter] Factor out mnemonic generation to accessible getMnemonic..

This seems reasonable to me.

Nov 4 2020, 8:32 AM · Restricted Project
nhaehnle added a comment to D88777: [AMDGPU] Add SI_EARLY_TERMINATE_SCC0 for early terminating shader.

LGTM modulo the inline comment.

Nov 4 2020, 8:26 AM · Restricted Project
nhaehnle added a comment to D90057: [TableGen] [test] Change integer ranges to use new '...' punctuation..

Does this mean it's time to remove the lo-hi syntax altogether?

Nov 4 2020, 8:19 AM · Restricted Project

Oct 30 2020

nhaehnle added a comment to D83088: Introduce CfgTraits abstraction.

I'm going to follow up with another RFC about this on llvm-dev.

Oct 30 2020, 11:56 AM · Restricted Project, Restricted Project, Restricted Project
nhaehnle added a comment to D85603: IR: Add convergence control operand bundle and intrinsics.

I'm going to try to give feedback, but with the caveat that there's a huge amount of discussion here, and with my apologies that I can't read the whole thread's worth of context. It's a lot. Sorry that I'm probably bringing up things that have already been discussed.

Oct 30 2020, 2:42 AM · Restricted Project
nhaehnle updated the diff for D85603: IR: Add convergence control operand bundle and intrinsics.

Address the comments from @jlebar that I indicate I'd address,
except for changes affecting the Verifier -- I'll do those later.

Oct 30 2020, 2:39 AM · Restricted Project

Oct 27 2020

nhaehnle added a reverting change for rG03a5f7ce12e2: Try to make GCC5 happy about the CfgTraits thing: rGe025d09b216d: Revert multiple patches based on "Introduce CfgTraits abstraction".
Oct 27 2020, 12:34 PM
nhaehnle added a reverting change for rGc0cdd22c72fa: Introduce CfgTraits abstraction: rGe025d09b216d: Revert multiple patches based on "Introduce CfgTraits abstraction".
Oct 27 2020, 12:34 PM
nhaehnle added a reverting change for rGf2a06875b604: Wrap CfgTraitsFor in namespace llvm to please GCC 5: rGe025d09b216d: Revert multiple patches based on "Introduce CfgTraits abstraction".
Oct 27 2020, 12:34 PM
nhaehnle added a reverting change for rGa74fc481588f: CfgInterface: rename interface() to getInterface(): rGe025d09b216d: Revert multiple patches based on "Introduce CfgTraits abstraction".
Oct 27 2020, 12:34 PM
nhaehnle committed rGe025d09b216d: Revert multiple patches based on "Introduce CfgTraits abstraction" (authored by nhaehnle).
Revert multiple patches based on "Introduce CfgTraits abstraction"
Oct 27 2020, 12:34 PM
nhaehnle added a reverting change for D83088: Introduce CfgTraits abstraction: rGe025d09b216d: Revert multiple patches based on "Introduce CfgTraits abstraction".
Oct 27 2020, 12:34 PM · Restricted Project, Restricted Project, Restricted Project
nhaehnle added a reverting change for rG848a68a032d1: DomTree: Extract (mostly) read-only logic into type-erased base classes: rGce6900c6cb51: Revert "DomTree: Extract (mostly) read-only logic into type-erased base classes".
Oct 27 2020, 12:34 PM
nhaehnle committed rGce6900c6cb51: Revert "DomTree: Extract (mostly) read-only logic into type-erased base classes" (authored by nhaehnle).
Revert "DomTree: Extract (mostly) read-only logic into type-erased base classes"
Oct 27 2020, 12:34 PM
nhaehnle added a reverting change for D83089: DomTree: Extract (mostly) read-only logic into type-erased base classes: rGce6900c6cb51: Revert "DomTree: Extract (mostly) read-only logic into type-erased base classes".
Oct 27 2020, 12:34 PM · Restricted Project

Oct 26 2020

nhaehnle added a comment to D89995: Make the post-commit review expectations more explicit with respect to revert.

Overall, I think the policy change proposed here isn't entirely unreasonable, but I do think it needs to be treated as a change of policy. Not everybody is necessarily aware of a 4 year old email thread; heck, I've been working on upstream LLVM and frontends for 5 years and I wasn't aware of this supposed policy where in reality, things weren't actually done according to what's written in the policy documents. If I don't know this after 5 years, how can anybody joining in the last 4 years possibly be aware other than through mere chance. What's written in the documents needs to matter, if only as part of making LLVM a welcoming and inclusive community. Hidden tribal knowledge is the opposite of those ideals.

Oct 26 2020, 10:38 AM · Restricted Project
nhaehnle added a comment to D85603: IR: Add convergence control operand bundle and intrinsics.

ping^2

Oct 26 2020, 8:23 AM · Restricted Project

Oct 24 2020

nhaehnle added a comment to D83088: Introduce CfgTraits abstraction.

I replied on llvm-dev.

Oct 24 2020, 3:14 PM · Restricted Project, Restricted Project, Restricted Project
nhaehnle requested changes to D89995: Make the post-commit review expectations more explicit with respect to revert.

Considering that you are doing this in response to D83088, I think we have pre-existing evidence that this isn't workable as-is. There has to be some obligation on the people "thinking there should be more review" to actually engage in review. In particular, if they have previously demonstrated that they cannot be relied on for this, then the rule should not apply.

Oct 24 2020, 1:14 PM · Restricted Project

Oct 23 2020

nhaehnle added a comment to D83088: Introduce CfgTraits abstraction.

Hi Mehdi, this is not an appropriate place for this discussion. Yes, we have a general rule that patches can be reverted if they're obviously broken (e.g. build bot problems) or clearly violate some other standard. This is a good rule, but it doesn't apply here. If you think it does, please state your case in the email thread that I've started on llvm-dev for this very purpose. Just one thing:

Oct 23 2020, 8:16 AM · Restricted Project, Restricted Project, Restricted Project
nhaehnle committed rGa74fc481588f: CfgInterface: rename interface() to getInterface() (authored by nhaehnle).
CfgInterface: rename interface() to getInterface()
Oct 23 2020, 7:52 AM

Oct 22 2020

nhaehnle added a comment to D89826: [FunctionAttrs][NPM] Fix handling of convergent.

Oh, and on the change itself: I'm not familiar enough with the attributor framework to judge the implementation, but the described reasoning (being able to make deductions from indirect calls) is sound.

Oct 22 2020, 12:05 PM · Restricted Project
nhaehnle added a comment to D89826: [FunctionAttrs][NPM] Fix handling of convergent.

At a high level, it should be fine to assume an indirect call without the "convergent" attribute isn't convergent. If the language rules for some language say the callee of an indirect call might be convergent, the frontend can add the convergent attribute to that call. (Unlike most attributes, it's a negative attribute: it restricts optimizations, and attribute inference optimizations would remove it,)

I don't object to that but I want us to remove the "may" from the lang ref and make it a "will" (or similar). It describes the semantics not the optimization we "may" perform.

Oct 22 2020, 12:04 PM · Restricted Project
nhaehnle added a comment to D89880: [AMDGPU] Reorder SIMemoryLegalizer functions to be consistent.

If we're nitpicking over such details, I would point out that it would be better to structure the whole thing differently, with a single legalizer class in which the insertAcquire etc. methods have different paths based on the hardware generation.

Oct 22 2020, 11:51 AM · Restricted Project
nhaehnle accepted D89962: [TableGen] Update documents to make them more complete.

LGTM

Oct 22 2020, 8:45 AM · Restricted Project
nhaehnle accepted D88081: [AMDGPU] Move WQM Pass after MI Scheduler.

LGTM

Oct 22 2020, 8:41 AM · Restricted Project
nhaehnle accepted D89814: [TableGen] Change !getop and !setop to !getdagop and !setdagop.

LGTM

Oct 22 2020, 8:33 AM · Restricted Project, Restricted Project
nhaehnle added inline comments to D88540: [AMDGPU] Add amdgpu_gfx calling convention.
Oct 22 2020, 7:59 AM · Restricted Project
nhaehnle added a comment to D88540: [AMDGPU] Add amdgpu_gfx calling convention.

I noticed what I believe is one more correctness issues, plus a few assorted comments.

Oct 22 2020, 7:58 AM · Restricted Project

Oct 21 2020

nhaehnle added a comment to D83088: Introduce CfgTraits abstraction.

David, I don't think this is appropriate here. Let's take the discussion to llvm-dev.

Oct 21 2020, 12:58 PM · Restricted Project, Restricted Project, Restricted Project

Oct 20 2020

nhaehnle committed rG848a68a032d1: DomTree: Extract (mostly) read-only logic into type-erased base classes (authored by nhaehnle).
DomTree: Extract (mostly) read-only logic into type-erased base classes
Oct 20 2020, 10:53 AM
nhaehnle closed D83089: DomTree: Extract (mostly) read-only logic into type-erased base classes.
Oct 20 2020, 10:53 AM · Restricted Project
nhaehnle committed rGc0cdd22c72fa: Introduce CfgTraits abstraction (authored by nhaehnle).
Introduce CfgTraits abstraction
Oct 20 2020, 4:51 AM
nhaehnle closed D83088: Introduce CfgTraits abstraction.
Oct 20 2020, 4:51 AM · Restricted Project, Restricted Project, Restricted Project

Oct 19 2020

nhaehnle added a comment to D89610: AMDGPU: Fix missing read/writelane cases to skip with exec=0.

Adding writelane absolutely makes sense to me, but:

Additionally, since we emit the final encoded instructions, this wasn't really matching readlane either.

:(

We have a bunch of places where we _always_ use real instructions, and presumably those are fine. But if we let real instructions creep into places where we usually use pseudos, we end up making the backend less efficient because over time, everybody will have to check for pseudos and reals (like in this particular case here). Can we instead add some sort of check that those real instructions aren't used during CodeGen? (The MI verifier should be able to check for this, for example.)

There is some reason we use the encoded versions for readlane/writelane that I never remember what it is

Oct 19 2020, 10:42 AM · Restricted Project
nhaehnle added a comment to D89618: [AMDGPU] Optimize waitcnt insertion for flat memory operations.

LGTM modulo that inline question.

Oct 19 2020, 10:31 AM · Restricted Project
nhaehnle accepted D89644: [AMDGPU] Remove fix up operand from SI_ELSE.

Thanks, LGTM

Oct 19 2020, 10:21 AM · Restricted Project
nhaehnle added a comment to D89525: [amdgpu] Enhance AMDGPU AA..

This is fine for graphics.

Oct 19 2020, 10:14 AM · Restricted Project
nhaehnle added a comment to D89610: AMDGPU: Fix missing read/writelane cases to skip with exec=0.

Adding writelane absolutely makes sense to me, but:

Oct 19 2020, 10:05 AM · Restricted Project
nhaehnle added a comment to D89595: [AMDGPU] Update AMDGPUUsage.rst.

Duplicate of D89596?

Oct 19 2020, 10:01 AM · Restricted Project
nhaehnle added inline comments to D88081: [AMDGPU] Move WQM Pass after MI Scheduler.
Oct 19 2020, 9:57 AM · Restricted Project
nhaehnle added a comment to D18714: Add writeonly IR attribute.

That was very long ago. My best guess would be "oversight".

Oct 19 2020, 9:45 AM · Restricted Project