This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Don't skip adding predicate matcher
ClosedPublic

Authored by madhur13490 on Jul 2 2020, 4:04 AM.

Details

Summary

This patch fixes a bug which skipped
adding predicate matcher for a pattern in many cases.
For example, if predicate is Load and
its memoryVT is non-null then the loop
continues and never reaches to the end which
adds the predicate matcher. This patch moves the
matcher addition to the top of the loop
so that it gets added regardless of contextual checks
later in the loop.
Other way to fix this issue is to remove all "continue" statements
in checks and let the loop continue till end.

Diff Detail

Event Timeline

madhur13490 created this revision.Jul 2 2020, 4:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 2 2020, 4:04 AM
madhur13490 retitled this revision from [GlobalISel] Don't skip adding predicate code to [GlobalISel] Don't skip adding predicate macther.Jul 2 2020, 4:05 AM
madhur13490 retitled this revision from [GlobalISel] Don't skip adding predicate macther to [GlobalISel] Don't skip adding predicate matcher.
madhur13490 abandoned this revision.Jul 2 2020, 4:23 AM
madhur13490 edited the summary of this revision. (Show Details)Jul 2 2020, 5:28 AM
madhur13490 reclaimed this revision.Jul 2 2020, 5:45 AM

Fix test failures

Could you provide a more concrete example that goes wrong without this change? I believe that could help understanding, would be good to add to the commit message for this.

arsenm added a comment.Jul 6 2020, 7:09 AM

I would expect the existing tablegen tests to break from this and need updating?

llvm/utils/TableGen/GlobalISelEmitter.cpp
3813

I think this should remain as the last predicate type added to the matcher, after everything else here. The ordering does matter (see D82331)

Could you add a test case or point us at a example of an existing test that goes wrong? The description in the commit message is unlikely to be the full story as we already cover loads with non-null MemoryVT's. My best guess is you are attempting to use builtin predicates and custom predicates together. I don't see a reason why that shouldn't be allowed but it's not something that was intended as the goal was aiming to fully remove the custom C++ from the tablegen headers so that tablegen could do some transformations on sextload/zextload and similar to fix DAGISel vs GlobalISel modelling mismatches.

llvm/utils/TableGen/GlobalISelEmitter.cpp
3686–3696

These two cases covered all of the loads with predicates that were moved to built-in predicates.

Could you add a test case or point us at a example of an existing test that goes wrong? The description in the commit message is unlikely to be the full story as we already cover loads with non-null MemoryVT's. My best guess is you are attempting to use builtin predicates and custom predicates together. I don't see a reason why that shouldn't be allowed but it's not something that was intended as the goal was aiming to fully remove the custom C++ from the tablegen headers so that tablegen could do some transformations on sextload/zextload and similar to fix DAGISel vs GlobalISel modelling mismatches.

I think the problem is broader than just combining custom predicates and builtin. The emitter here implicitly assumes all of these builtin PatFrag predicates are used exactly as the hierarchy used to define the default/generic load/store patterns. However, the PatFrags are much more free form and you can define a patfrag that combines multiple of these predicates in the same "layer" of the load/store hierarchy. The AMDGPU patterns redefine the entire set of load/store patterns, and in some cases it makes sense to combine all the predicates at once. I had to somewhat artificially add new layers of PatFrags to only apply a single predicate at a time. This also isn't consistently applied (for example, it does work to combine the AddressSpaces predicate and alignment)

madhur13490 marked an inline comment as done.Jul 13 2020, 9:13 AM

Could you add a test case or point us at a example of an existing test that goes wrong? The description in the commit message is unlikely to be the full story as we already cover loads with non-null MemoryVT's. My best guess is you are attempting to use builtin predicates and custom predicates together. I don't see a reason why that shouldn't be allowed but it's not something that was intended as the goal was aiming to fully remove the custom C++ from the tablegen headers so that tablegen could do some transformations on sextload/zextload and similar to fix DAGISel vs GlobalISel modelling mismatches.

I think the problem is broader than just combining custom predicates and builtin. The emitter here implicitly assumes all of these builtin PatFrag predicates are used exactly as the hierarchy used to define the default/generic load/store patterns. However, the PatFrags are much more free form and you can define a patfrag that combines multiple of these predicates in the same "layer" of the load/store hierarchy. The AMDGPU patterns redefine the entire set of load/store patterns, and in some cases it makes sense to combine all the predicates at once. I had to somewhat artificially add new layers of PatFrags to only apply a single predicate at a time. This also isn't consistently applied (for example, it does work to combine the AddressSpaces predicate and alignment)

Yes in addition to what Matt explained here, I believe the custom predicates should be allowed to use on any patterns because we cannot always guarantee that everything can be covered in other checks. Sometimes, we may want to do complicated checks for which custom predicates makes more sense that adding a builtin for each sub-checks. FWIW, the function needs refactoring because it does checks like isAtomic() and isLoad() several times. We can really do these checks once and then add an appropriate checker. The function seems tailored to a class of instructions but the name of the function seems generic.

I have added a test.

llvm/utils/TableGen/GlobalISelEmitter.cpp
3813

Well, then we have to remove all "continue" statements. Are we fine with it? I always thought that we'll use custom predicates for things which are *not* covered in other checks. I am fine with keeping it at the end, just looking for consequences.

rebase + add test

I would expect the existing tablegen tests to break from this and need updating?

No failures.

Could you add a test case or point us at a example of an existing test that goes wrong? The description in the commit message is unlikely to be the full story as we already cover loads with non-null MemoryVT's. My best guess is you are attempting to use builtin predicates and custom predicates together. I don't see a reason why that shouldn't be allowed but it's not something that was intended as the goal was aiming to fully remove the custom C++ from the tablegen headers so that tablegen could do some transformations on sextload/zextload and similar to fix DAGISel vs GlobalISel modelling mismatches.

I think the problem is broader than just combining custom predicates and builtin. The emitter here implicitly assumes all of these builtin PatFrag predicates are used exactly as the hierarchy used to define the default/generic load/store patterns.

That's because the builtin ones implement the predicates from the PatFrag hierarchy that was in the Target.td loads. Those PatFrags resulted in three distinct predicates applied to the same node on most load/stores. That's what I mean when I say mixing builtin/custom wasn't intended. The builtins were only intended to replace that specific hierarchy. Once we had C++ predicates there wasn't an apparent need to extend the builtins any further.

However, the PatFrags are much more free form and you can define a patfrag that combines multiple of these predicates in the same "layer" of the load/store hierarchy. The AMDGPU patterns redefine the entire set of load/store patterns, and in some cases it makes sense to combine all the predicates at once. I had to somewhat artificially add new layers of PatFrags to only apply a single predicate at a time.

Yep, I don't see a reason why you shouldn't be allowed to use these in different ways. I'm just saying that wasn't intended when they were added. The expectation was that other predicates would continue to use C++ predicates as they did before. There are some benefits to factoring out the portions of the C++ that are equivalent to these predicates though (the generated state machine can make use of this information when it's optimized) which is partly why I think it ought to be allowed.

This also isn't consistently applied (for example, it does work to combine the AddressSpaces predicate and alignment)

IIRC, that's because there was originally a C++ predicate that checked both together. I think it was target specific rather than Target.td though

Could you add a test case or point us at a example of an existing test that goes wrong? The description in the commit message is unlikely to be the full story as we already cover loads with non-null MemoryVT's. My best guess is you are attempting to use builtin predicates and custom predicates together. I don't see a reason why that shouldn't be allowed but it's not something that was intended as the goal was aiming to fully remove the custom C++ from the tablegen headers so that tablegen could do some transformations on sextload/zextload and similar to fix DAGISel vs GlobalISel modelling mismatches.

I think the problem is broader than just combining custom predicates and builtin. The emitter here implicitly assumes all of these builtin PatFrag predicates are used exactly as the hierarchy used to define the default/generic load/store patterns. However, the PatFrags are much more free form and you can define a patfrag that combines multiple of these predicates in the same "layer" of the load/store hierarchy. The AMDGPU patterns redefine the entire set of load/store patterns, and in some cases it makes sense to combine all the predicates at once. I had to somewhat artificially add new layers of PatFrags to only apply a single predicate at a time. This also isn't consistently applied (for example, it does work to combine the AddressSpaces predicate and alignment)

Yes in addition to what Matt explained here, I believe the custom predicates should be allowed to use on any patterns because we cannot always guarantee that everything can be covered in other checks. Sometimes, we may want to do complicated checks for which custom predicates makes more sense that adding a builtin for each sub-checks.

I'm not sure I'm following this. Custom predicates can be used on any patterns. What I'm saying is that you can't (currently) have a single PatFrag that is simultaneously a builtin and a custom predicate. They have to be distinct PatFrags at the moment (i.e. a PatFrag with a custom predicate that matches another PatFrag with a builtin predicate) like how sextloadi32 contains sextload which contains unindexedload. These three add distinct predicates to form the three predicates on one node that we had in the internal representation of the pattern.

FWIW, the function needs refactoring because it does checks like isAtomic() and isLoad() several times. We can really do these checks once and then add an appropriate checker. The function seems tailored to a class of instructions but the name of the function seems generic.

The alternatives I tried at the time generally ended up more confusing. The current builtins code is generally structured as:

if (isBuiltin1()) {
  addBuiltin1();
  continue;
}
if (isBuiltin2()) {
  addBuiltin2();
  continue;
}

but the isBuiltinN()'s were split into subpredicates because the MemVT's made the number of specific predicates excessive and because the predicates left over after extracting that left a fair bit of code duplication.
I do think it should be hoisted into a function though.

I have added a test.

Thanks. That confirmed my understanding of the issue

llvm/test/TableGen/ContextlessPredicates.td
14–16

Thanks, this confirms what I was thinking. This PatFrag is simultaneously a builtin and a custom which wasn't really intended but should probably be allowed (even if it wasn't at the very least it should error, silently dropping it is not ok). The equivalent that fits into the original intent is:

def atomic_swap_i32 : PatFrag<(ops node:$ptr, node:$val),
                                               (atomic_swap node:$ptr, node:$val)> {
  let IsAtomic = 1;
  let MemoryVT = i32;
}
def test_atomic_op_frag : PatFrag<(ops node:$ptr, node:$val),
                                               (atomic_swap_i32 node:$ptr, node:$val)> {
  let GISelPredicateCode = [{ dbgs() <<  return !MRI.use_nodbg_empty(MI.getOperand(0).getReg()); }];
}
llvm/utils/TableGen/GlobalISelEmitter.cpp
3813

I'd be inclined to factor the builtins out into a function instead so they can return and then you resume at the code to check+add for custom predicates

dsanders added inline comments.Jul 15 2020, 6:05 PM
llvm/utils/TableGen/GlobalISelEmitter.cpp
3813

Also, just to confirm I agree it should be added after the builtins. Otherwise your C++ can't assume anything covered by the builtins.

madhur13490 marked 2 inline comments as done.Jul 16 2020, 1:49 AM
madhur13490 added inline comments.
llvm/utils/TableGen/GlobalISelEmitter.cpp
3813

Is it just for "assumption" or it would affect the functionality if custom predicates are added later?

3813

Sure, I am up for refactoring, but can you please explain a bit more, how would you that refactoring look? I was more inclined to rewriting the loop which iterates over PredicateCalls and add matcher. One of the many ways in which we can refactor the loop is like

if (isAtomic())
  if (isLoad())
   add all matchers
...

but this may change the order in which matchers are added (which brings me back to the question - is the order important?). I can further outline this in a static function.
But this is a bit of design question.

So, I'd like to see what you're imagining in a form of pseudo code :)

dsanders added inline comments.Jul 20 2020, 10:09 AM
llvm/utils/TableGen/GlobalISelEmitter.cpp
3813

Suppose the builtin checks it's a i32 non-extending load and the custom predicate wants to check the address generation. If the custom predicate is first then it needs to check the following pseudo-code:

isLoad() && isNonExtending() && isMMO32bit() && checkAddrGen(...)

whereas if the builtin is first then the first three components are tested by the builtin first so the custom predicate is:

checkAddrGen(...)
3813

Sure, I am up for refactoring, but can you please explain a bit more, how would you that refactoring look?

What I had in mind was:

  • Extract lines 3572 to 3707 of the original code into a function, passing parameters by value or by reference as appropriate
  • Replace all the continues with return

Then after those predicates are added it will resume execution from line 3709 and do the custom predicates

which brings me back to the question - is the order important?

In general, the order predicates are tested does matter. For example, testing an MMO property (e.g. the MemoryVT) without knowing that an MMO is there will lead to a crash. Specific pairs might be reorderable but unless there's a strong need to change the current order and re-verify the new order works for all targets, I'd preserve it.

madhur13490 marked an inline comment as done.Jul 21 2020, 7:12 AM
madhur13490 added inline comments.
llvm/utils/TableGen/GlobalISelEmitter.cpp
3813

With your suggested way, we'd be extracting few statements of loop in the function but for rest of the loop body, we'll still have to run the loop in caller i.e. current function. So, we'd be running the loop twice which seems time consuming to me. Line 3707 belongs to this loop. The loop I am referring to is for (const TreePredicateCall &Call : Src->getPredicateCalls()) { at line 3604 in the original code.

Are you sure about this?

I was thinking of just refactoring the loop something like

for (const TreePredicateCall &Call : Src->getPredicateCalls()) {  
     addBuiltinPredicate(); // new function
     if (Predicate.hasGISelPredicateCode()) {
      InsnMatcher.addPredicate<GenericInstructionPredicateMatcher>(Predicate);
      continue;
    }}
dsanders added inline comments.Jul 24 2020, 12:20 PM
llvm/utils/TableGen/GlobalISelEmitter.cpp
3813

I believe we're describing the same thing. My suggestion left it at:

for (const TreePredicateCall &Call : Src->getPredicateCalls()) {
  const TreePredicateFn &Predicate = Call.Fn;
  if (Predicate.isAlwaysTrue())
    continue;

  if (Predicate.isImmediatePattern()) {
    InsnMatcher.addPredicate<InstructionImmPredicateMatcher>(Predicate);
    continue;
  }

  addBuiltinPredicate(...);

  if (Predicate.hasGISelPredicateCode()) {
    InsnMatcher.addPredicate<GenericInstructionPredicateMatcher>(Predicate);
    continue;
  }

  return failedImport("Src pattern child has predicate (" +
                      explainPredicates(Src) + ")");
}

the code that was extracted into addBuiltinPredicate() then does return wherever it currently does continue.

The loop I am referring to is for (const TreePredicateCall &Call : Src->getPredicateCalls()) { at line 3604 in the original code.

I'm not sure we're using the same line numbers here. 3604 is:

continue;

in the original code and

0, MemoryVsLLTSizePredicateMatcher::EqualTo, 0);

in the new code according to phabricator.

rebase + address comments

madhur13490 marked an inline comment as done.Jul 27 2020, 8:56 AM
madhur13490 added inline comments.
llvm/utils/TableGen/GlobalISelEmitter.cpp
3813

Yeah, we're not referring to same line numbers :)
But anyway as we're on the same page, I updated the change to reflect.

@arsenm , Can you please take over? I think @dsanders is on vacation.

arsenm added inline comments.Aug 11 2020, 12:42 PM
llvm/test/TableGen/ContextlessPredicates.td
14

Don't need dbgs() << here (not that you really need anything)

llvm/utils/TableGen/GlobalISelEmitter.cpp
3714

How is changing this to return any different than continue? It's still treating all of these as mutually exclusive

3819

No else after continue

madhur13490 added inline comments.Aug 18 2020, 8:38 AM
llvm/utils/TableGen/GlobalISelEmitter.cpp
3714

It is returning to the parent createAndImportSelDAGMatcher which further adds custom predicate.

rebase + address comments

fix formatting

fix unexpected tabs

arsenm accepted this revision.Aug 18 2020, 9:43 AM

LGTM. I think this is only a half solution, and it's still a problem that multiple builtin predicates will be silently ignored

llvm/utils/TableGen/GlobalISelEmitter.cpp
3714

OK, so this is *only* fixing custom predicate + builtin predicate. It is not fixing adding multiple builtin predicates at once. I think that can be a separate patch, but is just as broken

This revision is now accepted and ready to land.Aug 18 2020, 9:43 AM

LGTM. I think this is only a half solution, and it's still a problem that multiple builtin predicates will be silently ignored

I agree. Will fix soon.

This revision was landed with ongoing or failed builds.Aug 19 2020, 12:54 AM
This revision was automatically updated to reflect the committed changes.