Page MenuHomePhabricator

nhaehnle (Nicolai Hähnle)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 9 2015, 4:06 AM (258 w, 5 d)

Recent Activity

Today

nhaehnle added a comment to D87674: [AMDGPU] Insert waitcnt after returning from call.

The revert also highlights the problem that we don't have representative test in llvm/test :(

Thu, Sep 24, 12:22 AM · Restricted Project

Yesterday

nhaehnle added a comment to D87674: [AMDGPU] Insert waitcnt after returning from call.

Yes, this commit is incorrect. It completely breaks code linking in Mesa OpenGL. s_waitcnt is required at the end of all global functions that return values.

Please revert. @nhaehnle

I don't understand why would it fail. This patch just moves s_waitcnt to the caller so they would be executed anyway. I think I am missing something. It would be helpful to root cause if we can isolate to a small test case.

Shader returns aren't real returns and the "caller" doesn't wait

I see. So how should this be implemented? May be we conditionalize this patch just for compute?

Wed, Sep 23, 10:02 AM · Restricted Project
nhaehnle added inline comments to D85603: IR: Add convergence control operand bundle and intrinsics.
Wed, Sep 23, 9:27 AM · Restricted Project

Tue, Sep 15

nhaehnle added inline comments to D85603: IR: Add convergence control operand bundle and intrinsics.
Tue, Sep 15, 6:08 AM · Restricted Project

Mon, Sep 7

nhaehnle added inline comments to D85604: SimplifyCFG: prevent certain transforms on convergent operations.
Mon, Sep 7, 7:59 AM · Restricted Project
nhaehnle added inline comments to D85603: IR: Add convergence control operand bundle and intrinsics.
Mon, Sep 7, 7:49 AM · Restricted Project
nhaehnle added inline comments to D83088: Introduce CfgTraits abstraction.
Mon, Sep 7, 7:38 AM · Restricted Project, Restricted Project, Restricted Project

Thu, Aug 27

nhaehnle added a comment to D86154: AMDGPU: Add llvm.amdgcn.{read,readfirst,write}lane2 intrinsics with type overloads.

ReplaceNodeResults expects the result type to be changed in semi-magical ways during vector type legalization, which is non-obvious since the method can be called from different places. I think it *could* be made to work with a lot of patience, but it's really a bad interface -- and besides, by doing it in IR we reduce code duplication between SelectionDAG and GlobalISel, which is an added benefit IMO.

Well the globalisel handling should be much simpler. We have a lot of stuff that's randomly handled in the IR to work around the DAG which long term should be moved where it belongs in codegen

Thu, Aug 27, 1:15 PM · Restricted Project, Restricted Project

Aug 24 2020

nhaehnle updated the diff for D83088: Introduce CfgTraits abstraction.
  • cleanup operators on CfgOpaqueType
  • address other review comments
Aug 24 2020, 10:29 AM · Restricted Project, Restricted Project, Restricted Project

Aug 23 2020

nhaehnle added a comment to D83088: Introduce CfgTraits abstraction.

The most immediate problem is divergence analysis, which is extremely complex and difficult to get right. If I had tried to fight the accidental complexity that comes with attempting to write such an algorithm as C++ templates in addition to the inherent complexity of the algorithm at the same time, I'm not sure I would have been able to produce anything workable at all.

Frankly, I suspect that our dominator tree implementation also suffer because of this, though at least dominator trees are much more well studied in the academic literature, so that helps keep the inherent complexity under control.

I'm totally open to discussing making APIs more usable, for sure - though I'm thinking it's likely a concept (like containers in the C++ standard library) might be the better direction.

Perhaps some code samples showing how one would interact (probably not whole algorithms - maybe something simple like generating a dot diagram for a graph) with these things given different APIs (traits, concepts, and runtime polymorphism) - and implementations of each kind too.

Take a look here for example: https://github.com/nhaehnle/llvm-project/blob/715450fa7f968ceefaf9c3b04b47066866c97206/llvm/lib/Analysis/GenericConvergenceUtils.cpp#L499 -- this is obviously still fairly simple, but it's an example of printing out the results of an analysis in a way that's generic over the underlying CFG and SSA form.

I'm having trouble following this example - I'm not sure what the CfgPrinter abstraction is/why it's first-class, and why this "print" function is calling what look like mutation operations like "appendBlocks". I guess perhaps the question is - what's it printing from and what's it printing to?

Ah, I see, the "append" functions are accessors, of a sort. Returning a container might be more clear than using an out parameter - alternatively, a functor parameter (ala std::for_each) that is called for each element, that can then be used to populate an existing container if desired, or to do immediate processing without the need for an intermediate container.

Aug 23 2020, 11:57 PM · Restricted Project, Restricted Project, Restricted Project

Aug 21 2020

nhaehnle committed rG196e6f9f1893: Replace TableGen range piece punctuator with '...' (authored by Paul-C-Anagnostopoulos).
Replace TableGen range piece punctuator with '...'
Aug 21 2020, 2:34 PM
nhaehnle closed D85585: Replace TableGen range piece punctuator with '..'.
Aug 21 2020, 2:34 PM · Restricted Project
nhaehnle added a comment to D85852: Fix two bugs in TGParser::ParseValue.

Hmm, forgot to adjust the author field to you for that commit. Sorry about that, I'm doing it for the other ones.

Aug 21 2020, 2:25 PM · Restricted Project
nhaehnle committed rG17cd34409a3a: Fix two bugs in TGParser::ParseValue (authored by nhaehnle).
Fix two bugs in TGParser::ParseValue
Aug 21 2020, 2:20 PM
nhaehnle closed D85852: Fix two bugs in TGParser::ParseValue.
Aug 21 2020, 2:20 PM · Restricted Project
nhaehnle committed rGe0c01e6cb071: New TableGen Programmer's Reference document (authored by Paul-C-Anagnostopoulos).
New TableGen Programmer's Reference document
Aug 21 2020, 2:19 PM
nhaehnle closed D85838: New TableGen Programmer's Reference document.
Aug 21 2020, 2:18 PM · Restricted Project
nhaehnle committed rGb37db11d95d8: MachineSSAUpdater: Allow initialization with just a register class (authored by nhaehnle).
MachineSSAUpdater: Allow initialization with just a register class
Aug 21 2020, 2:05 PM
nhaehnle closed D85602: MachineSSAUpdater: Allow initialization with just a register class.
Aug 21 2020, 2:05 PM · Restricted Project

Aug 20 2020

nhaehnle added a comment to D86154: AMDGPU: Add llvm.amdgcn.{read,readfirst,write}lane2 intrinsics with type overloads.

Note that part of my motivation here over D84639 is to support more general types on the lane intrinsics, since they also express some semantic content which would be interesting to be able to express e.g. on descriptors. I wasn't able to bend the SelectionDAG type legalization to my will, so that's why I instead "legalize" the intrinsics in the AMDGPUCodeGenPrepare pass.

Don't you just need to handle this in ReplaceNodeResults the same way?

Aug 20 2020, 1:46 PM · Restricted Project, Restricted Project
nhaehnle updated the diff for D86154: AMDGPU: Add llvm.amdgcn.{read,readfirst,write}lane2 intrinsics with type overloads.

Don't duplicate the intrinsics. Rely on D86317 to reduce the pain of this
change caused to downstream users.

Aug 20 2020, 1:44 PM · Restricted Project, Restricted Project
nhaehnle requested review of D86317: IRBuilder: add CreateIntrinsicByType method.
Aug 20 2020, 1:43 PM · Restricted Project
nhaehnle added a comment to D83088: Introduce CfgTraits abstraction.

Based on your description and the DomTree patches, if I understand correctly, the primary motivation is to facilitate writing CFG-representation-agnostic algorithms/analyses (e.g., dominators, divergence, convergence analyses), such that you can later lift the results back to the representation-aware types? If that's correct, I support the overall goal. Having spent probably ~weeks wrangling with domtree templates, this sounds like something that could simplify life a lot and potentially cut down on compilation times & sizes of llvm binaries.

Aug 20 2020, 7:55 AM · Restricted Project, Restricted Project, Restricted Project
nhaehnle updated the diff for D85603: IR: Add convergence control operand bundle and intrinsics.

clang-format fixes

Aug 20 2020, 7:43 AM · Restricted Project
nhaehnle updated the diff for D85603: IR: Add convergence control operand bundle and intrinsics.
  • tighten the static rules about cycles: there was a gap in the exact phrasing if two loop heart intrinsics in a cycle use _different_ convergence tokens
  • add verifier checks and corresponding tests for the static rules
Aug 20 2020, 7:42 AM · Restricted Project

Aug 19 2020

nhaehnle added a comment to D83088: Introduce CfgTraits abstraction.

But I guess coming back to the original/broader design: What problems is this intended to solve? The inability to write non-template algorithms over graphs? What cost does that come with? Are there algorithms that are a bit too complicated/unwieldy when done as templates?
If it's specifically the static/dynamic dispatch issue - I'm not sure the type erasure and runtime overhead may be worth the tradeoff here, though if it is - it'd be good to keep the non-dynamic version common, rather than now having GraphTraits and CfgTraits done a bit differently, etc.

It's not just over graphs, but taking SSA values into account as well -- that is the key distinction between GraphTraits and CfgTraits.

Not sure I follow - could you give an example of a graph where the GraphTraits concept of the Graph and the CfgTraits concept of the graph (or, perhaps more importantly - features of the graph/API surface area/properties you can expose through the CFG API/concept/thing but not through GraphTraits?

Aug 19 2020, 2:11 PM · Restricted Project, Restricted Project, Restricted Project

Aug 18 2020

nhaehnle added a comment to D83088: Introduce CfgTraits abstraction.

Not sure that's the best place to be designing this fairly integral and complicated piece of infrastructure from, but hoping we can find some good places/solutions/etc.

Aug 18 2020, 10:40 AM · Restricted Project, Restricted Project, Restricted Project
nhaehnle added a comment to D86154: AMDGPU: Add llvm.amdgcn.{read,readfirst,write}lane2 intrinsics with type overloads.

Note that part of my motivation here over D84639 is to support more general types on the lane intrinsics, since they also express some semantic content which would be interesting to be able to express e.g. on descriptors. I wasn't able to bend the SelectionDAG type legalization to my will, so that's why I instead "legalize" the intrinsics in the AMDGPUCodeGenPrepare pass.

Aug 18 2020, 10:26 AM · Restricted Project, Restricted Project
nhaehnle added a comment to D84639: AMDGPU: Add type mangling to llvm.amdgcn.readfirstlane.

D86154 is an attempt to do essentially this, on all lane intrinsics simultaneously, giving them a new name to avoid a flag day change.

Aug 18 2020, 10:24 AM · Restricted Project
nhaehnle added a reviewer for D86154: AMDGPU: Add llvm.amdgcn.{read,readfirst,write}lane2 intrinsics with type overloads: foad.
Aug 18 2020, 10:24 AM · Restricted Project, Restricted Project
nhaehnle requested review of D86154: AMDGPU: Add llvm.amdgcn.{read,readfirst,write}lane2 intrinsics with type overloads.
Aug 18 2020, 10:23 AM · Restricted Project, Restricted Project

Aug 17 2020

nhaehnle added a comment to D85838: New TableGen Programmer's Reference document.

First of all, thank you for doing this. I haven't gone through it fully yet. The main thing that is missing in my opinion is to remove stuff from the existing files. You include here a listing of TableGen syntax -- that's fine, but it's now redundant with what's in TableGen/LangRef.rst, and this redundancy is a very bad place for us to be in.

You are most welcome. My plan was to get this committed first. Note that this includes an update to the TableGen Overview to refer to this document rather than the two old ones. A search turns up no other references to the two old documents in the Clang or LLVM documentation. Then once this is committed, we can delete the two old files. (Is that what we do, delete them?)

Aug 17 2020, 11:35 AM · Restricted Project
nhaehnle added a comment to D85852: Fix two bugs in TGParser::ParseValue.

Do I need to reformat the code as Lint suggests? Most of the if (!xxx) return ... tests in the code are on one line.

Aug 17 2020, 11:31 AM · Restricted Project
nhaehnle added a comment to D85585: Replace TableGen range piece punctuator with '..'.

I like this. I agree with David that the goal should be to remove existing uses of the old syntax, but I think it's fine to do that in a separate patch. One small nit, apart from that LGTM.

Aug 17 2020, 11:27 AM · Restricted Project
nhaehnle accepted D85853: Eliminate Sphinx warnings from Tablegen/LangRef.rst.

LGTM

Aug 17 2020, 11:25 AM · Restricted Project
nhaehnle added inline comments to D85838: New TableGen Programmer's Reference document.
Aug 17 2020, 11:24 AM · Restricted Project
nhaehnle added a comment to D85838: New TableGen Programmer's Reference document.

First of all, thank you for doing this. I haven't gone through it fully yet. The main thing that is missing in my opinion is to remove stuff from the existing files. You include here a listing of TableGen syntax -- that's fine, but it's now redundant with what's in TableGen/LangRef.rst, and this redundancy is a very bad place for us to be in.

Aug 17 2020, 11:09 AM · Restricted Project

Aug 14 2020

nhaehnle added inline comments to D85609: Transforms: add ConvergenceControlHeuristic pass.
Aug 14 2020, 11:39 AM · Restricted Project
nhaehnle added inline comments to D85607: CfgTraits: add CfgInstructionRef.
Aug 14 2020, 11:38 AM · Restricted Project, Restricted Project, Restricted Project
nhaehnle added inline comments to D85606: Inliner: handle the convergence control operand bundle.
Aug 14 2020, 11:38 AM · Restricted Project
nhaehnle updated the diff for D85609: Transforms: add ConvergenceControlHeuristic pass.
  • add an inline assembly test
  • simplification suggest by a review comment
Aug 14 2020, 11:37 AM · Restricted Project
nhaehnle updated the diff for D85607: CfgTraits: add CfgInstructionRef.
  • use llvm_unreachable
Aug 14 2020, 11:37 AM · Restricted Project, Restricted Project, Restricted Project
nhaehnle updated the diff for D85606: Inliner: handle the convergence control operand bundle.
  • add another test case
Aug 14 2020, 11:36 AM · Restricted Project
nhaehnle updated the diff for D85604: SimplifyCFG: prevent certain transforms on convergent operations.
  • sink use getCalledFunction() below convergence check in ValueTracking
Aug 14 2020, 11:36 AM · Restricted Project
nhaehnle updated the diff for D85603: IR: Add convergence control operand bundle and intrinsics.

Add more language about loops

Aug 14 2020, 11:35 AM · Restricted Project
nhaehnle added a comment to D84639: AMDGPU: Add type mangling to llvm.amdgcn.readfirstlane.

On second thought, can we make this not be a flag-day change?

Aug 14 2020, 11:30 AM · Restricted Project
nhaehnle added inline comments to D85604: SimplifyCFG: prevent certain transforms on convergent operations.
Aug 14 2020, 11:14 AM · Restricted Project
nhaehnle added a comment to D83088: Introduce CfgTraits abstraction.

This seems like a strange hybrid between a static-polymorphism (with traits) and dynamic polymorphism (with base classes/virtual functions). Could this more readily be just one or the other? (sounds like you're leaning towards dynamic polymorphism)

No, it's very much this way on purpose. The idea is to support the same set of functionality as much as possible in both static and dynamic polymorphism.

Could it be implemented statically as a primary interface, with a dynamic wrapper? (eg: a base class, then a derived class template that takes the static CFG type to wrap into the dynamic type) keeping the two concepts more clearly separated?

That is how it is implemented. CfgTraits is the primary static interface, and then CfgInterface / CfgInterfaceImpl is the dynamic wrapper.

Ah, fair enough. The inheritance details in the traits class confused me a bit/I had a hard time following, with all the features being in the one patch. Might be easier separately, but not sure.

Would it be possible for this not to use traits - I know @asbirlea and I had trouble with some things using GraphTraits owing to the traits API. An alternative would be to describe a CFGGraph concept (same as a standard container concept, for instance) - where there is a concrete graph object and that object is queried for things like nodes, edges, etc. (actually one of the significant things we tripped over was the API choice to navigate edges from a node itself without any extra state - which meant nodes/edge iteration had to carry state (potentially pointers back to the graph, etc) to be able to manifest their edges - trait or concept could both address this by, for traits, passing the graph as well as the node when querying the trait for edges, or for a concept passing the node back to the graph to query for edges).

Aug 14 2020, 11:04 AM · Restricted Project, Restricted Project, Restricted Project
nhaehnle accepted D85667: Reset PAL metadata when AMDGPU traget stream finishes.

LGTM minus something rather obvious.

Aug 14 2020, 10:19 AM · Restricted Project

Aug 13 2020

nhaehnle added inline comments to D85603: IR: Add convergence control operand bundle and intrinsics.
Aug 13 2020, 4:02 AM · Restricted Project
nhaehnle updated the diff for D85603: IR: Add convergence control operand bundle and intrinsics.

Typos and yet slightly more detail.

Aug 13 2020, 12:32 AM · Restricted Project
nhaehnle added inline comments to D85603: IR: Add convergence control operand bundle and intrinsics.
Aug 13 2020, 12:01 AM · Restricted Project
nhaehnle updated the diff for D85603: IR: Add convergence control operand bundle and intrinsics.

Actually submit all the changes that I thought I had submitted
two days ago.

Aug 13 2020, 12:00 AM · Restricted Project

Aug 12 2020

nhaehnle added a comment to D83088: Introduce CfgTraits abstraction.

This seems like a strange hybrid between a static-polymorphism (with traits) and dynamic polymorphism (with base classes/virtual functions). Could this more readily be just one or the other? (sounds like you're leaning towards dynamic polymorphism)

No, it's very much this way on purpose. The idea is to support the same set of functionality as much as possible in both static and dynamic polymorphism.

Could it be implemented statically as a primary interface, with a dynamic wrapper? (eg: a base class, then a derived class template that takes the static CFG type to wrap into the dynamic type) keeping the two concepts more clearly separated?

Aug 12 2020, 12:49 PM · Restricted Project, Restricted Project, Restricted Project
nhaehnle added inline comments to D85603: IR: Add convergence control operand bundle and intrinsics.
Aug 12 2020, 12:48 PM · Restricted Project
nhaehnle added a comment to D83088: Introduce CfgTraits abstraction.

This seems like a strange hybrid between a static-polymorphism (with traits) and dynamic polymorphism (with base classes/virtual functions). Could this more readily be just one or the other? (sounds like you're leaning towards dynamic polymorphism)

Aug 12 2020, 12:27 PM · Restricted Project, Restricted Project, Restricted Project

Aug 11 2020

nhaehnle added inline comments to D85603: IR: Add convergence control operand bundle and intrinsics.
Aug 11 2020, 8:08 AM · Restricted Project
nhaehnle updated the diff for D85603: IR: Add convergence control operand bundle and intrinsics.

With this change, I've edited the documents in a way where I hope all
comments have been addressed.

Aug 11 2020, 8:07 AM · Restricted Project

Aug 10 2020

nhaehnle updated the diff for D85603: IR: Add convergence control operand bundle and intrinsics.

Address a first batch of review comments. I got almost as far as the final examples section.

Aug 10 2020, 8:54 AM · Restricted Project
nhaehnle added inline comments to D85603: IR: Add convergence control operand bundle and intrinsics.
Aug 10 2020, 8:52 AM · Restricted Project

Aug 9 2020

nhaehnle abandoned D68994: [RFC] Redefine `convergent` in terms of dynamic instances.

I am abandoning this change in favor of D85603, which is the next and hopefully final iteration on this.

Aug 9 2020, 7:47 AM · Restricted Project
nhaehnle requested review of D85609: Transforms: add ConvergenceControlHeuristic pass.
Aug 9 2020, 7:45 AM · Restricted Project
nhaehnle requested review of D85608: Analysis: Add GenericConvergenceUtils and related passes.
Aug 9 2020, 7:43 AM · Restricted Project
nhaehnle requested review of D85607: CfgTraits: add CfgInstructionRef.
Aug 9 2020, 7:31 AM · Restricted Project, Restricted Project, Restricted Project
nhaehnle requested review of D85606: Inliner: handle the convergence control operand bundle.
Aug 9 2020, 7:29 AM · Restricted Project
nhaehnle requested review of D85605: LoopUnroll: adjust for new `convergent` semantics.
Aug 9 2020, 7:28 AM · Restricted Project
nhaehnle requested review of D85604: SimplifyCFG: prevent certain transforms on convergent operations.
Aug 9 2020, 7:24 AM · Restricted Project
nhaehnle requested review of D85603: IR: Add convergence control operand bundle and intrinsics.
Aug 9 2020, 7:23 AM · Restricted Project
nhaehnle requested review of D85602: MachineSSAUpdater: Allow initialization with just a register class.
Aug 9 2020, 7:17 AM · Restricted Project
nhaehnle updated the diff for D83094: Analysis: Add a GenericCycleInfo analysis.
v7:
- add llvm::Cycle alias
- add GenericCycleBase::isRoot and fix ::isNaturalLoop
- fix a number of bugs in GenericCycleInfoBase::extendCycle
  and add a `transferredBlocks` argument
- use a "check" macro in validateTree so that a single breakpoint can
  catch all validation errors
Aug 9 2020, 7:15 AM · Restricted Project
nhaehnle updated the diff for D83088: Introduce CfgTraits abstraction.
v7:
- std::forward fix in wrapping_iterator
- fix typos
Aug 9 2020, 7:13 AM · Restricted Project, Restricted Project, Restricted Project

Aug 6 2020

nhaehnle accepted D85207: Fix 64-bit copy to SCC.

LGTM

Aug 6 2020, 1:39 AM · Restricted Project
nhaehnle accepted D84314: AMDGPU/GlobalISel: Fix assert on copy to vcc.

LGTM

Aug 6 2020, 1:33 AM · Restricted Project
nhaehnle added a comment to D84779: [AMDGPU] Add amdgpu specific loop threshold metadata.

Do you have a pointer to how this is planned to be used in the frontend? Why is the per-function attribute not enough?

Aug 6 2020, 1:22 AM · Restricted Project

Aug 5 2020

nhaehnle added a comment to D84639: AMDGPU: Add type mangling to llvm.amdgcn.readfirstlane.

I think this makes sense, but it is a flag-day change for both Mesa and LLPC. Intrinsics created with incorrect mangling don't fall over completely (as evidenced by the fact that your change *doesn't* change the test files), but certain things like attributes tend to not work quite correctly anymore.

Aug 5 2020, 9:02 AM · Restricted Project

Jul 28 2020

nhaehnle added a comment to D68994: [RFC] Redefine `convergent` in terms of dynamic instances.

Now that I'm actually editing this text, I realize that this is an older version than I thought. I've given up on "join" and "point" as being too noisy in the IR in favor of something that more explicitly tracks loop structure, which is the real point of concern at least for what we care about.

Jul 28 2020, 2:31 AM · Restricted Project

Jul 27 2020

nhaehnle accepted D84489: Use llvm::is_contained where appropriate (NFC).

This is a nice cleanup, thanks. LGTM.

Jul 27 2020, 2:14 AM · Restricted Project
nhaehnle abandoned D22199: AMDGPU: Leave WQM earlier before VMEM loads.
Jul 27 2020, 2:12 AM
nhaehnle accepted D84313: AMDGPU: global_atomic_csub is not always dereferenceable.

LGTM

Jul 27 2020, 2:10 AM · Restricted Project
nhaehnle added a comment to D84413: [DA][SDA] SyncDependenceAnalysis re-write.

It's interesting that you end up using a modified post-order where loops are kept contiguous, I found the same thing helpful in my work on control flow lowering in AMDGPU.

Jul 27 2020, 2:05 AM · Restricted Project

Jul 24 2020

Herald added a reviewer for D68994: [RFC] Redefine `convergent` in terms of dynamic instances: jdoerfert.

Can you expand on the issues that are present here?

Jul 24 2020, 9:53 AM · Restricted Project
nhaehnle committed rG5934df0c9abe: MachineBasicBlock: add printName method (authored by nhaehnle).
MachineBasicBlock: add printName method
Jul 24 2020, 9:19 AM
nhaehnle closed D83253: MachineBasicBlock: add printName method.
Jul 24 2020, 9:19 AM · Restricted Project
nhaehnle retitled D83092: DomTree: Add climbLhsUntilSiblings helper from DomTree: Add findSiblingOfUncle helper to DomTree: Add climbLhsUntilSiblings helper.
Jul 24 2020, 9:14 AM · Restricted Project
nhaehnle updated the diff for D83094: Analysis: Add a GenericCycleInfo analysis.
v6:
- rename {to,from}Generic -> {wrap,unwrap}Ref
- validateTree prints errors and returns bool now instead of
  using assertions
Jul 24 2020, 9:14 AM · Restricted Project
nhaehnle updated the diff for D83092: DomTree: Add climbLhsUntilSiblings helper.
v6:
- rename to climbLhsUntilSiblings and add some explanatory ASCII art
  and comments for why this operation could be interesting
Jul 24 2020, 9:13 AM · Restricted Project
nhaehnle updated the diff for D83089: DomTree: Extract (mostly) read-only logic into type-erased base classes.
v6:
- style change: iDom -> idom; it's arguable whether this is really
  invalid, since it is actually standard camelCase, but clang-tidy
  complains about it so... *shrug*
- rename {to,from}Generic -> {wrap,unwrap}Ref
Jul 24 2020, 9:13 AM · Restricted Project
nhaehnle updated the diff for D83088: Introduce CfgTraits abstraction.
v6:
- implement predecessors/successors for all CfgTraits implementations
- fix error in unwrapRange
- rename toGeneric/fromGeneric into wrapRef/unwrapRef to have naming
  that is consistent with {wrap,unwrap}{Iterator,Range}
- use getVRegDef instead of getUniqueVRegDef
Jul 24 2020, 9:12 AM · Restricted Project, Restricted Project, Restricted Project
nhaehnle added inline comments to D83094: Analysis: Add a GenericCycleInfo analysis.
Jul 24 2020, 8:58 AM · Restricted Project
nhaehnle added inline comments to D83088: Introduce CfgTraits abstraction.
Jul 24 2020, 8:40 AM · Restricted Project, Restricted Project, Restricted Project
nhaehnle accepted D84464: AMDGPU: Skip other terminators before inserting s_cbranch_exec[n]z.

It's hard to say whether this is really correct given how underdefined the control flow pseudos are, but it seems like a pretty reasonable change to me. Assuming you've properly tested this change, LGTM

Jul 24 2020, 8:36 AM · Restricted Project
nhaehnle added a comment to D84477: AMDGPU: Don't assume there is only one terminator copy.

Update the comment above removeTerminatorBit. LGTM otherwise.

Jul 24 2020, 8:30 AM · Restricted Project
nhaehnle accepted D84478: AMDGPU: Optimize copies to exec with other insts after exec def.

LGTM and thanks, I think this kind of change may help with the codegen in my new control flow lowering.

Jul 24 2020, 8:28 AM · Restricted Project
nhaehnle added a comment to D84448: [AMDGPU] Make generating cache invalidating instructions optional.

LGTM with the option name fixed, and please adjust the variable name also (amdgcn-skip-cache-invalidations --> AmdgcnSkipCacheInvalidations).

Jul 24 2020, 7:16 AM · Restricted Project
nhaehnle added inline comments to D84448: [AMDGPU] Make generating cache invalidating instructions optional.
Jul 24 2020, 6:44 AM · Restricted Project

Jul 21 2020

nhaehnle accepted D81728: [InstCombine] Add target-specific inst combining.

This has had a month of good review that has been addressed, I'd say it's good to go.

Jul 21 2020, 10:41 AM · Restricted Project, Restricted Project, Restricted Project

Jul 17 2020

nhaehnle added a comment to D80249: CodeGen: Don't lazily construct MachineFunctionInfo.

Do you really need to change the signature of MachineFunctionInfo::create? That seems orthogonal to the problem you're trying to solve, or perhaps a redundant solution.

The problem is: due to lazy construction and being able to refer to the MachineFunction, the state of the MachineFunctionInfo can have surprising accidental differences depending on when exactly it was created. There are two possible solutions:

  1. Prevent the MachineFunctionInfo from accessing the MachineFunction.
  2. Always create the MachineFunctionInfo in the same place.

This change does both, but either solution should be sufficient, and I personally find #1 counter-intuitive: it seems to me that the additional target-specific info about an object should be allowed to access that object.

You can't really do anything with the empty MachineFunction on construction. At the construction point, there isn't any information to access and it would always be a mistake to inspect it or make any construction decisions based on it (it would also be bad for the constructor to modify the MachineFunction).

Jul 17 2020, 10:01 AM · Restricted Project

Jul 16 2020

nhaehnle accepted D78285: [TableGen] Report an error instead of asserting.

LGTM

Jul 16 2020, 2:04 PM · Restricted Project
nhaehnle added inline comments to D83674: [AMDGPU] Calculate minimum allowed occupancy based on threads per lane.
Jul 16 2020, 1:11 PM · Restricted Project
nhaehnle added a comment to D80249: CodeGen: Don't lazily construct MachineFunctionInfo.

Do you really need to change the signature of MachineFunctionInfo::create? That seems orthogonal to the problem you're trying to solve, or perhaps a redundant solution.

Jul 16 2020, 1:10 PM · Restricted Project